ronaldeddings commented 1 month ago

I'm experiencing high CPU and memory utilization from the screenpipe process

Under settings, application is showing status "healthy"

Any updates would be greatly appreciated so that I can run screenpipe with other memory intensive applications at the same time

louis030195 commented 1 month ago

@ronaldeddings thanks for the feedback

in my experience screenpipe uses 8 gb memory (macbook pro m3 max 32 gb) but it's possible that we have usage spikes

we will look into this 🙏

PS: atm you can reduce CPU usage (and probably memory) by using dev mode and lowering the --fps arg, or by disabling audio with --disable-audio. the default on mac is 0.2, which is 1 frame every 5 seconds; 0.1 would be 1 frame every 10 s, and 1 is 1 frame per second. higher frequency = higher CPU/memory usage.
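
To make the --fps arithmetic concrete, here is a small sketch that converts an fps value into the number of seconds between captured frames. The helper name `seconds_per_frame` is made up for illustration and is not part of screenpipe.

```python
def seconds_per_frame(fps: float) -> float:
    """Return the capture interval in seconds for a given --fps value."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1.0 / fps


# The three values mentioned above: 0.1, the mac default 0.2, and 1.
for fps in (0.1, 0.2, 1.0):
    print(f"--fps {fps}: one frame every {seconds_per_frame(fps):g} s")
```

Halving the fps doubles the interval, so going from 0.2 to 0.1 should roughly halve the number of frames that need OCR.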

you can also use cloud OCR using --ocr-engine unstructured or cloud STT using --audio-transcription-engine deepgram (they also provide higher quality)

we provide free cloud usage for a few months

I think biggest consumption is OCR atm (could be wrong)

we're going to make these settings available in non dev mode soon

thanks for the patience 🙏

algora-pbc[bot] commented 3 weeks ago

💎 $150 bounty • Screenpi.pe

Steps to solve:

Start working: Comment /attempt #183 with your implementation plan
Submit work: Create a pull request including /claim #183 in the PR body to claim the bounty
Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Thank you for contributing to mediar-ai/screenpipe!

m13v commented 3 weeks ago

Memory leak is the priority in this issue [addressed to bounty contributors]

chandeldivyam commented 3 weeks ago

Hi @m13v @louis030195 Can you give a brief overview of how to reproduce this? Is it possible to do it on a Windows computer?

m13v commented 3 weeks ago

I think it's the same on all OSes: just keep screenpipe running and it will accumulate more and more memory. First it's 1 GB, in 10 minutes it's 2 GB, in 1 hour it's 4 GB..
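
As a rough sketch of what those figures imply, the growth rate between consecutive readings can be computed as below. The numbers are the approximate ones quoted above, and treating the first reading as minute 0 is an assumption; none of this is measured data.

```python
# (minutes since start, memory in GB), from the rough figures above.
# Assumption: the "first it's 1 GB" reading is taken at minute 0.
samples = [(0, 1.0), (10, 2.0), (60, 4.0)]


def growth_rates(points):
    """Return GB-per-minute growth between consecutive samples."""
    return [
        (m1 - m0) / (t1 - t0)
        for (t0, m0), (t1, m1) in zip(points, points[1:])
    ]


for (t0, _), (t1, _), rate in zip(samples, samples[1:], growth_rates(samples)):
    print(f"{t0:>2}-{t1:<2} min: {rate:.2f} GB/min")
```

With these figures the growth is not linear: it is faster in the first 10 minutes than over the following 50.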

chandeldivyam commented 3 weeks ago

Been running for 21 minutes; when it started it was only 300 MB. What am I doing wrong? Video is not being captured: I checked the database and .screenpipe/data as well, and no files are being created. I also checked via the app, and no files are being created there either. I was also wondering how I could change the transcription model, maybe to medium. And maybe the problem is that it is using Tesseract as the OCR engine instead of native OCR?

(base) PS D:\projects\screen-pipe> .\target\release\screenpipe.exe
[2024-08-23T09:23:59Z WARN  screenpipe] Screenpipe hasn't been extensively tested on this OS. We'd love your feedback!
Would love your feedback on the UX, let's a 15 min call soon:
https://cal.com/louis030195/screenpipe
[2024-08-23T09:23:59Z INFO  screenpipe]   Microphone (4- High Definition Audio Device) (input)
[2024-08-23T09:23:59Z INFO  screenpipe]   BenQ EL2870U (NVIDIA High Definition Audio) (output)
[2024-08-23T09:23:59Z INFO  screenpipe_server::db] Migrations executed successfully.
[2024-08-23T09:23:59Z INFO  screenpipe] Database initialized, will store files in C:\Users\ABC\.screenpipe
[2024-08-23T09:23:59Z INFO  screenpipe] Server started on http://localhost:3030

                                            _
   __________________  ___  ____     ____  (_____  ___
  / ___/ ___/ ___/ _ \/ _ \/ __ \   / __ \/ / __ \/ _ \
 (__  / /__/ /  /  __/  __/ / / /  / /_/ / / /_/ /  __/
/____/\___/_/   \___/\___/_/ /_/  / .___/_/ .___/\___/
                                 /_/     /_/

Build AI apps that have the full context
Open source | Runs locally | Developer friendly

┌─────────────────────┬────────────────────────────────────┐
│ Setting             │ Value                              │
├─────────────────────┼────────────────────────────────────┤
│ FPS                 │ 1                                  │
│ Audio Chunk Duration│ 30 seconds                         │
│ Port                │ 3030                               │
│ Audio Disabled      │ false                              │
│ Self Healing        │ false                              │
│ Save Text Files     │ false                              │
│ Audio Engine        │ WhisperTiny                        │
│ OCR Engine          │ Tesseract                          │
│ Monitor ID          │ 65537                              │
│ Data Directory      │ C:\Users\ABC\.screenpipe           │
│ Debug Mode          │ false                              │
├─────────────────────┼────────────────────────────────────┤
│ Audio Devices       │                                    │
[2024-08-23T09:23:59Z INFO  screenpipe_server::server] Starting server on 0.0.0.0:3030
│                     │ Microphone (4- High Definition ... │
│                     │ BenQ EL2870U (NVIDIA High Defin... │
└─────────────────────┴────────────────────────────────────┘
You are using local processing. All your data stays on your computer.

[2024-08-23T09:24:00Z INFO  screenpipe_audio::stt] device = Cuda(CudaDevice(DeviceId(1)))
[2024-08-23T09:24:00Z INFO  hf_hub] Token file not found "C:\\Users\\ABC\\.cache\\huggingface\\token"
[2024-08-23T09:24:00Z INFO  screenpipe_server::video] Starting new video capture
[2024-08-23T09:24:00Z INFO  screenpipe_server::video] Started capture thread
[2024-08-23T09:24:10Z INFO  screenpipe_server::resource_monitor] Runtime: 10s, Total Memory: 2% (0.45 GB / 23.94 GB), Total CPU: 102%
[2024-08-23T09:24:14Z INFO  screenpipe_audio::core] device: "Microphone (4- High Definition Audio Device) (input)"
[2024-08-23T09:24:14Z INFO  screenpipe_audio::core] device: "BenQ EL2870U (NVIDIA High Definition Audio) (output)"
[2024-08-23T09:24:14Z INFO  screenpipe_audio::core] Recording Microphone (4- High Definition Audio Device) (input) for 30 seconds
[2024-08-23T09:24:14Z INFO  screenpipe_audio::core] Recording BenQ EL2870U (NVIDIA High Definition Audio) (output) for 30 seconds
[2024-08-23T09:24:20Z INFO  screenpipe_server::resource_monitor] Runtime: 20s, Total Memory: 2% (0.44 GB / 23.94 GB), Total CPU: 107%
[2024-08-23T09:24:30Z INFO  screenpipe_server::resource_monitor] Runtime: 30s, Total Memory: 2% (0.45 GB / 23.94 GB), Total CPU: 102%
[2024-08-23T09:24:40Z INFO  screenpipe_server::resource_monitor] Runtime: 40s, Total Memory: 2% (0.45 GB / 23.94 GB), Total CPU: 102%
[2024-08-23T09:24:44Z INFO  screenpipe_audio::core] Recording stopped, wrote to C:\Users\ABC\.screenpipe\data\Microphone (4- High Definition Audio Device) (input)_2024-08-23_09-24-14.mp4. Now triggering transcription
[2024-08-23T09:24:44Z INFO  screenpipe_server::core] Finished record_and_transcribe for device Microphone (4- High Definition Audio Device) (input) (iteration 1)
[2024-08-23T09:24:44Z INFO  screenpipe_server::core] Recording complete for device Microphone (4- High Definition Audio Device) (input) (iteration 1): "C:\\Users\\ABC\\.screenpipe\\data\\Microphone (4- High Definition Audio Device) (input)_2024-08-23_09-24-14.mp4"
[2024-08-23T09:24:44Z INFO  screenpipe_server::core] Finished iteration 1 for device Microphone (4- High Definition Audio Device) (input)
[2024-08-23T09:24:44Z INFO  screenpipe_audio::core] device: "Microphone (4- High Definition Audio Device) (input)"
[2024-08-23T09:24:44Z INFO  screenpipe_audio::core] Recording Microphone (4- High Definition Audio Device) (input) for 30 seconds
[2024-08-23T09:24:44Z INFO  screenpipe_audio::stt] Resampling from 44100 Hz to 16000 Hz
[2024-08-23T09:24:44Z INFO  screenpipe_audio::stt] Total audio_frames processed: 3002, frames that include speech: 256
[2024-08-23T09:24:44Z INFO  screenpipe_audio::core] Recording stopped, wrote to C:\Users\ABC\.screenpipe\data\BenQ EL2870U (NVIDIA High Definition Audio) (output)_2024-08-23_09-24-14.mp4. Now triggering transcription
[2024-08-23T09:24:44Z INFO  screenpipe_server::core] Finished record_and_transcribe for device BenQ EL2870U (NVIDIA High Definition Audio) (output) (iteration 1)
[2024-08-23T09:24:44Z INFO  screenpipe_server::core] Recording complete for device BenQ EL2870U (NVIDIA High Definition Audio) (output) (iteration 1): "C:\\Users\\ABC\\.screenpipe\\data\\BenQ EL2870U (NVIDIA High Definition Audio) (output)_2024-08-23_09-24-14.mp4"
[2024-08-23T09:24:44Z INFO  screenpipe_server::core] Finished iteration 1 for device BenQ EL2870U (NVIDIA High Definition Audio) (output)
[2024-08-23T09:24:44Z INFO  screenpipe_audio::core] device: "BenQ EL2870U (NVIDIA High Definition Audio) (output)"
[2024-08-23T09:24:45Z INFO  screenpipe_audio::core] Recording BenQ EL2870U (NVIDIA High Definition Audio) (output) for 30 seconds
[2024-08-23T09:24:45Z INFO  screenpipe_audio::multilingual] detected language: ("nn", "nynorsk")
[2024-08-23T09:24:45Z INFO  screenpipe_audio::stt] 0.0s -- 30.0s
[2024-08-23T09:24:45Z INFO  screenpipe_audio::stt]   0.0s-0.0s:
[2024-08-23T09:24:45Z INFO  screenpipe_audio::stt]   0.0s-2.0s:  See you next time!
[2024-08-23T09:24:45Z ERROR screenpipe_audio::stt] STT error for input C:\Users\ABC\.screenpipe\data\BenQ EL2870U (NVIDIA High Definition Audio) (output)_2024-08-23_09-24-14.mp4: no supported audio tracks
[2024-08-23T09:24:45Z INFO  screenpipe_server::core] Received transcription

@m13v

louis030195 commented 3 weeks ago

@chandeldivyam

[2024-08-23T09:24:45Z ERROR screenpipe_audio::stt] STT error for input C:\Users\ABC\.screenpipe\data\BenQ EL2870U (NVIDIA High Definition Audio) (output)_2024-08-23_09-24-14.mp4: no supported audio tracks

there is an error with audio

louis030195 commented 3 weeks ago

@m13v @chandeldivyam the first task for this issue is to have repeatable measurement of performance, otherwise we're just optimising blindly

example to track accuracy & speed of OCR

https://github.com/mediar-ai/screenpipe/blob/main/screenpipe-vision/benches/ocr_benchmark.rs

https://mediar-ai.github.io/screenpipe/dev/bench/

chandeldivyam commented 3 weeks ago

That's actually a very weird thing; whether it's Windows in general or just my computer, I am not sure. cpal goes crazy.

So, if there is no audio (I am not on a call or watching a YouTube video), i.e. no system audio, then there is no callback from cpal to the stream. Whether that is just my computer or Windows in general, I am not sure.

So in a previous project, I artificially added some sample vectors without sound.

Check the write_to_file at the end: I added silence so that this issue doesn't come up.

I will start a YouTube video in the background for the time being to look at the memory issue, but I am very certain about why the current error is there.
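
A minimal sketch of the silence-padding idea, in Python for brevity (the real code is Rust on top of cpal, and this is not the actual implementation from either project). The function name `pad_with_silence` and the 16 kHz mono format are assumptions for illustration.

```python
def pad_with_silence(
    samples: list[float], sample_rate: int, duration_s: float
) -> list[float]:
    """Pad a mono sample buffer with zeros so it spans the full chunk duration."""
    expected = int(sample_rate * duration_s)
    missing = max(0, expected - len(samples))
    return samples + [0.0] * missing


# No callback fired during a 30 s chunk, so the captured buffer is empty.
chunk = pad_with_silence([], sample_rate=16_000, duration_s=30)
print(len(chunk))
```

The file that gets written then always contains an audio track of the expected length, even when the output device was silent for the whole chunk.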

chandeldivyam commented 3 weeks ago

So I started a video, i.e. the output device now has audio, and that error is gone.

(base) PS D:\projects\screen-pipe> .\target\release\screenpipe.exe
[2024-08-23T09:53:42Z WARN  screenpipe] Screenpipe hasn't been extensively tested on this OS. We'd love your feedback!
Would love your feedback on the UX, let's a 15 min call soon:
https://cal.com/louis030195/screenpipe
[2024-08-23T09:53:42Z INFO  screenpipe]   Microphone (4- High Definition Audio Device) (input)
[2024-08-23T09:53:42Z INFO  screenpipe]   Headphones (4- High Definition Audio Device) (output)
[2024-08-23T09:53:42Z INFO  screenpipe_server::db] Migrations executed successfully.
[2024-08-23T09:53:42Z INFO  screenpipe] Database initialized, will store files in C:\Users\ABC\.screenpipe
[2024-08-23T09:53:42Z INFO  screenpipe] Server started on http://localhost:3030

                                            _
   __________________  ___  ____     ____  (_____  ___
  / ___/ ___/ ___/ _ \/ _ \/ __ \   / __ \/ / __ \/ _ \
 (__  / /__/ /  /  __/  __/ / / /  / /_/ / / /_/ /  __/
/____/\___/_/   \___/\___/_/ /_/  / .___/_/ .___/\___/
                                 /_/     /_/

[2024-08-23T09:53:42Z INFO  screenpipe_server::server] Starting server on 0.0.0.0:3030

Build AI apps that have the full context
Open source | Runs locally | Developer friendly

┌─────────────────────┬────────────────────────────────────┐
│ Setting             │ Value                              │
├─────────────────────┼────────────────────────────────────┤
│ FPS                 │ 1                                  │
│ Audio Chunk Duration│ 30 seconds                         │
│ Port                │ 3030                               │
│ Audio Disabled      │ false                              │
│ Self Healing        │ false                              │
│ Save Text Files     │ false                              │
│ Audio Engine        │ WhisperTiny                        │
│ OCR Engine          │ Tesseract                          │
│ Monitor ID          │ 65537                              │
│ Data Directory      │ C:\Users\ABC\.screenpipe           │
│ Debug Mode          │ false                              │
├─────────────────────┼────────────────────────────────────┤
│ Audio Devices       │                                    │
│                     │ Microphone (4- High Definition ... │
│                     │ Headphones (4- High Definition ... │
└─────────────────────┴────────────────────────────────────┘
You are using local processing. All your data stays on your computer.

[2024-08-23T09:53:42Z INFO  screenpipe_audio::stt] device = Cuda(CudaDevice(DeviceId(1)))
[2024-08-23T09:53:42Z INFO  hf_hub] Token file not found "C:\\Users\\ABC\\.cache\\huggingface\\token"
[2024-08-23T09:53:42Z INFO  screenpipe_server::video] Starting new video capture
[2024-08-23T09:53:42Z INFO  screenpipe_server::video] Started capture thread
[2024-08-23T09:53:53Z INFO  screenpipe_server::resource_monitor] Runtime: 10s, Total Memory: 2% (0.41 GB / 23.94 GB), Total CPU: 108%
[2024-08-23T09:53:57Z INFO  screenpipe_audio::core] device: "Microphone (4- High Definition Audio Device) (input)"
[2024-08-23T09:53:57Z INFO  screenpipe_audio::core] device: "Headphones (4- High Definition Audio Device) (output)"
[2024-08-23T09:53:57Z INFO  screenpipe_audio::core] Recording Microphone (4- High Definition Audio Device) (input) for 30 seconds
[2024-08-23T09:53:57Z INFO  screenpipe_audio::core] Recording Headphones (4- High Definition Audio Device) (output) for 30 seconds
[2024-08-23T09:54:03Z INFO  screenpipe_server::resource_monitor] Runtime: 20s, Total Memory: 2% (0.44 GB / 23.94 GB), Total CPU: 105%
[2024-08-23T09:54:13Z INFO  screenpipe_server::resource_monitor] Runtime: 30s, Total Memory: 2% (0.44 GB / 23.94 GB), Total CPU: 105%
[2024-08-23T09:54:23Z INFO  screenpipe_server::resource_monitor] Runtime: 40s, Total Memory: 2% (0.44 GB / 23.94 GB), Total CPU: 104%
[2024-08-23T09:54:27Z INFO  screenpipe_audio::core] Recording stopped, wrote to C:\Users\ABC\.screenpipe\data\Microphone (4- High Definition Audio Device) (input)_2024-08-23_09-53-57.mp4. Now triggering transcription
[2024-08-23T09:54:27Z INFO  screenpipe_server::core] Finished record_and_transcribe for device Microphone (4- High Definition Audio Device) (input) (iteration 1)
[2024-08-23T09:54:27Z INFO  screenpipe_server::core] Recording complete for device Microphone (4- High Definition Audio Device) (input) (iteration 1): "C:\\Users\\ABC\\.screenpipe\\data\\Microphone (4- High Definition Audio Device) (input)_2024-08-23_09-53-57.mp4"
[2024-08-23T09:54:27Z INFO  screenpipe_server::core] Finished iteration 1 for device Microphone (4- High Definition Audio Device) (input)
[2024-08-23T09:54:27Z INFO  screenpipe_audio::core] device: "Microphone (4- High Definition Audio Device) (input)"
[2024-08-23T09:54:27Z INFO  screenpipe_audio::core] Recording Microphone (4- High Definition Audio Device) (input) for 30 seconds
[2024-08-23T09:54:27Z INFO  screenpipe_audio::stt] Resampling from 44100 Hz to 16000 Hz
[2024-08-23T09:54:27Z INFO  screenpipe_audio::core] Recording stopped, wrote to C:\Users\ABC\.screenpipe\data\Headphones (4- High Definition Audio Device) (output)_2024-08-23_09-53-57.mp4. Now triggering transcription
[2024-08-23T09:54:27Z INFO  screenpipe_server::core] Finished record_and_transcribe for device Headphones (4- High Definition Audio Device) (output) (iteration 1)
[2024-08-23T09:54:27Z INFO  screenpipe_server::core] Recording complete for device Headphones (4- High Definition Audio Device) (output) (iteration 1): "C:\\Users\\ABC\\.screenpipe\\data\\Headphones (4- High Definition Audio Device) (output)_2024-08-23_09-53-57.mp4"
[2024-08-23T09:54:27Z INFO  screenpipe_server::core] Finished iteration 1 for device Headphones (4- High Definition Audio Device) (output)
[2024-08-23T09:54:27Z INFO  screenpipe_audio::core] device: "Headphones (4- High Definition Audio Device) (output)"
[2024-08-23T09:54:27Z INFO  screenpipe_audio::stt] Total audio_frames processed: 3002, frames that include speech: 385
[2024-08-23T09:54:27Z INFO  screenpipe_audio::core] Recording Headphones (4- High Definition Audio Device) (output) for 30 seconds
[2024-08-23T09:54:27Z INFO  screenpipe_audio::multilingual] detected language: ("en", "english")
[2024-08-23T09:54:27Z INFO  screenpipe_audio::stt] no speech detected, skipping 3000 DecodingResult { tokens: [50258, 50259, 50359, 50364, 307, 322, 264, 6191, 3199, 13, 50464, 50257], text: "<|0.00|> is on the technical table.<|2.00|>", avg_logprob: -1.3954070615976575, no_speech_prob: 0.7845484018325806, temperature: 0.0, compression_ratio: NaN }
[2024-08-23T09:54:27Z INFO  screenpipe_audio::stt] Resampling from 48000 Hz to 16000 Hz
[2024-08-23T09:54:28Z INFO  screenpipe_audio::stt] Total audio_frames processed: 3001, frames that include speech: 2204
[2024-08-23T09:54:28Z INFO  screenpipe_server::core] Received transcription
[2024-08-23T09:54:28Z INFO  screenpipe_server::core] Inserting audio chunk: "C:\\Users\\ABC\\.screenpipe\\data\\Microphone (4- High Definition Audio Device) (input)_2024-08-23_09-53-57.mp4"
[2024-08-23T09:54:28Z INFO  screenpipe_audio::multilingual] detected language: ("en", "english")
[2024-08-23T09:54:29Z INFO  screenpipe_audio::stt] 0.0s -- 30.0s
[2024-08-23T09:54:29Z INFO  screenpipe_audio::stt]   0.0s-0.0s:
[2024-08-23T09:54:29Z INFO  screenpipe_audio::stt]   0.0s-21.8s:  is the longest podcast I've ever done. It's a fascinating, super technical and wide-ranging conversation. And I loved every minute of it. And now dear friends, here's Elon Musk. It's fifth time on this, the Lex Friedman podcast. Drink a cup of your water. Water. I'm so over caffeinated right now. Do you want some caffeine? I mean, sure. There's a, there's a nitro drink.
[2024-08-23T09:54:29Z INFO  screenpipe_audio::stt] no speech detected, skipping 4500 DecodingResult { tokens: [50258, 50259, 50359, 50364, 291, 13, 50464, 50257], text: "<|0.00|> you.<|2.00|>", avg_logprob: -1.3437390944148888, no_speech_prob: 0.9409240484237671, temperature: 0.0, compression_ratio: NaN }
[2024-08-23T09:54:29Z INFO  screenpipe_server::core] Received transcription
[2024-08-23T09:54:29Z INFO  screenpipe_server::core] Inserting audio chunk: "C:\\Users\\ABC\\.screenpipe\\data\\Headphones (4- High Definition Audio Device) (output)_2024-08-23_09-53-57.mp4"
[2024-08-23T09:54:29Z INFO  screenpipe_server::db] Successfully chunked audio transcription into 3 chunks
[2024-08-23T09:54:33Z INFO  screenpipe_server::resource_monitor] Runtime: 50s, Total Memory: 2% (0.56 GB / 23.94 GB), Total CPU: 139%
[2024-08-23T09:54:53Z INFO  screenpipe_server::resource_monitor] Runtime: 70s, Total Memory: 2% (0.56 GB / 23.94 GB), Total CPU: 107%
[2024-08-23T09:54:57Z INFO  screenpipe_audio::core] Recording stopped, wrote to C:\Users\ABC\.screenpipe\data\Microphone (4- High Definition Audio Device) (input)_2024-08-23_09-54-27.mp4. Now triggering transcription
[2024-08-23T09:54:57Z INFO  screenpipe_server::core] Finished record_and_transcribe for device Microphone (4- High Definition Audio Device) (input) (iteration 2)
[2024-08-23T09:54:57Z INFO  screenpipe_server::core] Recording complete for device Microphone (4- High Definition Audio Device) (input) (iteration 2): "C:\\Users\\ABC\\.screenpipe\\data\\Microphone (4- High Definition Audio Device) (input)_2024-08-23_09-54-27.mp4"
[2024-08-23T09:54:57Z INFO  screenpipe_server::core] Finished iteration 2 for device Microphone (4- High Definition Audio Device) (input)
[2024-08-23T09:54:57Z INFO  screenpipe_audio::core] device: "Microphone (4- High Definition Audio Device) (input)"
[2024-08-23T09:54:57Z INFO  screenpipe_audio::core] Recording Microphone (4- High Definition Audio Device) (input) for 30 seconds
[2024-08-23T09:54:57Z INFO  screenpipe_audio::stt] Resampling from 44100 Hz to 16000 Hz
[2024-08-23T09:54:57Z INFO  screenpipe_audio::stt] Total audio_frames processed: 3002, frames that include speech: 397
[2024-08-23T09:54:57Z INFO  screenpipe_audio::core] Recording stopped, wrote to C:\Users\ABC\.screenpipe\data\Headphones (4- High Definition Audio Device) (output)_2024-08-23_09-54-27.mp4. Now triggering transcription
[2024-08-23T09:54:57Z INFO  screenpipe_server::core] Finished record_and_transcribe for device Headphones (4- High Definition Audio Device) (output) (iteration 2)
[2024-08-23T09:54:57Z INFO  screenpipe_server::core] Recording complete for device Headphones (4- High Definition Audio Device) (output) (iteration 2): "C:\\Users\\ABC\\.screenpipe\\data\\Headphones (4- High Definition Audio Device) (output)_2024-08-23_09-54-27.mp4"
[2024-08-23T09:54:57Z INFO  screenpipe_server::core] Finished iteration 2 for device Headphones (4- High Definition Audio Device) (output)
[2024-08-23T09:54:57Z INFO  screenpipe_audio::core] device: "Headphones (4- High Definition Audio Device) (output)"
[2024-08-23T09:54:57Z INFO  screenpipe_audio::core] Recording Headphones (4- High Definition Audio Device) (output) for 30 seconds
[2024-08-23T09:54:57Z INFO  screenpipe_audio::multilingual] detected language: ("en", "english")
[2024-08-23T09:54:58Z INFO  screenpipe_audio::stt] 0.0s -- 30.0s
[2024-08-23T09:54:58Z INFO  screenpipe_audio::stt]   0.0s-0.0s:
[2024-08-23T09:54:58Z INFO  screenpipe_audio::stt]   0.0s-4.0s:  I need no idea how to get rid of it. I need no idea how to get rid of it. I think I have to really everything.
[2024-08-23T09:54:58Z INFO  screenpipe_audio::stt] Resampling from 48000 Hz to 16000 Hz
[2024-08-23T09:54:58Z INFO  screenpipe_audio::stt] Total audio_frames processed: 3003, frames that include speech: 2385
[2024-08-23T09:54:58Z INFO  screenpipe_server::core] Received transcription
[2024-08-23T09:54:58Z INFO  screenpipe_server::core] Inserting audio chunk: "C:\\Users\\ABC\\.screenpipe\\data\\Microphone (4- High Definition Audio Device) (input)_2024-08-23_09-54-27.mp4"
[2024-08-23T09:54:58Z INFO  screenpipe_server::db] Successfully chunked audio transcription into 1 chunks
[2024-08-23T09:54:58Z INFO  screenpipe_audio::multilingual] detected language: ("en", "english")
[2024-08-23T09:54:59Z INFO  screenpipe_audio::stt] 0.0s -- 30.0s
[2024-08-23T09:54:59Z INFO  screenpipe_audio::stt]   0.0s-0.0s:
[2024-08-23T09:54:59Z INFO  screenpipe_audio::stt]   0.0s-...:  This was to keep you up for like, you know, tomorrow afternoon basically. Yeah, I don't know. So what is nitro? It's just got a lot of caffeine. Don't ask questions. It's called nitro. Do you need to know anything else? It's got nitrogen. That's ridiculous. I mean, what we breathe is 78% not just anyway. What do you need to add more? What's the most people think they have the really oxygen and they're actually breathing 70%?
[2024-08-23T09:54:59Z INFO  screenpipe_audio::stt] no speech detected, skipping 4500 DecodingResult { tokens: [50258, 50259, 50359, 50364, 291, 13, 50464, 50257], text: "<|0.00|> you.<|2.00|>", avg_logprob: -1.3308436969938136, no_speech_prob: 0.9387331008911133, temperature: 0.0, compression_ratio: NaN }
[2024-08-23T09:54:59Z INFO  screenpipe_server::core] Received transcription
[2024-08-23T09:54:59Z INFO  screenpipe_server::core] Inserting audio chunk: "C:\\Users\\ABC\\.screenpipe\\data\\Headphones (4- High Definition Audio Device) (output)_2024-08-23_09-54-27.mp4"
[2024-08-23T09:54:59Z INFO  screenpipe_server::db] Successfully chunked audio transcription into 3 chunks
[2024-08-23T09:55:03Z INFO  screenpipe_server::resource_monitor] Runtime: 80s, Total Memory: 2% (0.56 GB / 23.94 GB), Total CPU: 139%
[2024-08-23T09:55:13Z INFO  screenpipe_server::resource_monitor] Runtime: 90s, Total Memory: 2% (0.56 GB / 23.94 GB), Total CPU: 104%
[2024-08-23T09:55:23Z INFO  screenpipe_server::resource_monitor] Runtime: 100s, Total Memory: 3% (0.62 GB / 23.94 GB), Total CPU: 102%
[2024-08-23T09:55:27Z INFO  screenpipe_audio::core] Recording stopped, wrote to C:\Users\ABC\.screenpipe\data\Microphone (4- High Definition Audio Device) (input)_2024-08-23_09-54-57.mp4. Now triggering transcription
[2024-08-23T09:55:27Z INFO  screenpipe_server::core] Finished record_and_transcribe for device Microphone (4- High Definition Audio Device) (input) (iteration 3)
[2024-08-23T09:55:27Z INFO  screenpipe_server::core] Recording complete for device Microphone (4- High Definition Audio Device) (input) (iteration 3): "C:\\Users\\ABC\\.screenpipe\\data\\Microphone (4- High Definition Audio Device) (input)_2024-08-23_09-54-57.mp4"
[2024-08-23T09:55:27Z INFO  screenpipe_server::core] Finished iteration 3 for device Microphone (4- High Definition Audio Device) (input)
[2024-08-23T09:55:27Z INFO  screenpipe_audio::core] device: "Microphone (4- High Definition Audio Device) (input)"
[2024-08-23T09:55:27Z INFO  screenpipe_audio::core] Recording Microphone (4- High Definition Audio Device) (input) for 30 seconds
[2024-08-23T09:55:27Z INFO  screenpipe_audio::stt] Resampling from 44100 Hz to 16000 Hz
[2024-08-23T09:55:27Z INFO  screenpipe_audio::stt] Total audio_frames processed: 3002, frames that include speech: 404
[2024-08-23T09:55:27Z INFO  screenpipe_audio::core] Recording stopped, wrote to C:\Users\ABC\.screenpipe\data\Headphones (4- High Definition Audio Device) (output)_2024-08-23_09-54-57.mp4. Now triggering transcription
[2024-08-23T09:55:27Z INFO  screenpipe_server::core] Finished record_and_transcribe for device Headphones (4- High Definition Audio Device) (output) (iteration 3)
[2024-08-23T09:55:27Z INFO  screenpipe_server::core] Recording complete for device Headphones (4- High Definition Audio Device) (output) (iteration 3): "C:\\Users\\ABC\\.screenpipe\\data\\Headphones (4- High Definition Audio Device) (output)_2024-08-23_09-54-57.mp4"
[2024-08-23T09:55:27Z INFO  screenpipe_server::core] Finished iteration 3 for device Headphones (4- High Definition Audio Device) (output)
[2024-08-23T09:55:27Z INFO  screenpipe_audio::core] device: "Headphones (4- High Definition Audio Device) (output)"
[2024-08-23T09:55:27Z INFO  screenpipe_audio::core] Recording Headphones (4- High Definition Audio Device) (output) for 30 seconds
[2024-08-23T09:55:27Z INFO  screenpipe_audio::multilingual] detected language: ("en", "english")
[2024-08-23T09:55:27Z INFO  screenpipe_audio::stt] 0.0s -- 30.0s
[2024-08-23T09:55:27Z INFO  screenpipe_audio::stt]   0.0s-0.0s:
[2024-08-23T09:55:27Z INFO  screenpipe_audio::stt]   0.0s-1.0s:  Thank you.
[2024-08-23T09:55:27Z INFO  screenpipe_audio::stt] Resampling from 48000 Hz to 16000 Hz
[2024-08-23T09:55:27Z INFO  screenpipe_audio::stt] Total audio_frames processed: 3003, frames that include speech: 2266
[2024-08-23T09:55:28Z INFO  screenpipe_server::core] Received transcription
[2024-08-23T09:55:28Z INFO  screenpipe_server::core] Inserting audio chunk: "C:\\Users\\ABC\\.screenpipe\\data\\Microphone (4- High Definition Audio Device) (input)_2024-08-23_09-54-57.mp4"
[2024-08-23T09:55:28Z INFO  screenpipe_server::db] Successfully chunked audio transcription into 1 chunks
[2024-08-23T09:55:28Z INFO  screenpipe_audio::multilingual] detected language: ("en", "english")
[2024-08-23T09:55:28Z INFO  screenpipe_audio::stt] 0.0s -- 30.0s
[2024-08-23T09:55:28Z INFO  screenpipe_audio::stt]   0.0s-0.0s:
[2024-08-23T09:55:28Z INFO  screenpipe_audio::stt]   0.0s-8.4s:  you need like a mocha like from like from clockwork orange yeah is that top three
[2024-08-23T09:55:28Z INFO  screenpipe_audio::stt]   8.4s-18.2s:  Kubrick film for you like we're just pretty good I mean it's meant it jarring okay so first
[2024-08-23T09:55:28Z INFO  screenpipe_audio::stt]   18.2s-22.5s:  let's step back and big congrats on getting your link and plant it into a human and
[2024-08-23T09:55:28Z INFO  screenpipe_audio::stt] no speech detected, skipping 4500 DecodingResult { tokens: [50258, 50259, 50359, 50364, 291, 13, 50464, 50257], text: "<|0.00|> you.<|2.00|>", avg_logprob: -1.3299248402253174, no_speech_prob: 0.9385878443717957, temperature: 0.0, compression_ratio: NaN }
[2024-08-23T09:55:28Z INFO  screenpipe_server::core] Received transcription
[2024-08-23T09:55:28Z INFO  screenpipe_server::core] Inserting audio chunk: "C:\\Users\\ABC\\.screenpipe\\data\\Headphones (4- High Definition Audio Device) (output)_2024-08-23_09-54-57.mp4"
[2024-08-23T09:55:28Z INFO  screenpipe_server::db] Successfully chunked audio transcription into 2 chunks
[2024-08-23T09:55:33Z INFO  screenpipe_server::resource_monitor] Runtime: 110s, Total Memory: 3% (0.62 GB / 23.94 GB), Total CPU: 129%

chandeldivyam commented 3 weeks ago

@louis030195

example to track accuracy & speed of OCR

Checking this. What are your thoughts currently? How could we get repeatable behavior?

Maybe we run individual functions in a controlled env, like you did for the OCR and vision benchmarks, and plot a graph of memory / other vitals w.r.t. time? That might be the first step for us. It would also make a good test case for later.

louis030195 commented 3 weeks ago

i think we can start by measuring small parts (esp. those that don't need a monitor or audio device, so they can run in CI) that likely use a lot of compute

then we can find a repeatable way to measure perf on a local computer (e.g. run some command that makes screenpipe run for X mins, and at the end we know memory & cpu over time, average, spikes, etc.)

then, if we can find a way to simulate a monitor and/or audio device in CI, that would be good

also we have this for runtime perf: https://github.com/mediar-ai/screenpipe/blob/main/screenpipe-server/src/resource_monitor.rs

but it's not super helpful atm; i added a feature to log to disk (you can then feed this into chatgpt to analyse perf specifically)
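
One way to get "average, spikes, etc." out of a run is to parse the resource_monitor lines that screenpipe already prints. The sketch below assumes the log format seen in the console output earlier in this thread (`Runtime: …s, Total Memory: …% (… GB / … GB), Total CPU: …%`); the function name `summarize` is made up, and the format may differ in other versions.

```python
import re
from statistics import mean

# Matches lines like:
# ... Runtime: 10s, Total Memory: 2% (0.45 GB / 23.94 GB), Total CPU: 102%
LINE_RE = re.compile(
    r"Runtime: (?P<runtime>\d+)s, Total Memory: [\d.]+% "
    r"\((?P<mem>[\d.]+) GB / [\d.]+ GB\), Total CPU: (?P<cpu>[\d.]+)%"
)


def summarize(lines):
    """Return average/peak memory (GB) and CPU (%) from resource_monitor lines."""
    rows = [
        (int(m["runtime"]), float(m["mem"]), float(m["cpu"]))
        for m in map(LINE_RE.search, lines)
        if m
    ]
    if not rows:
        return None
    mems = [r[1] for r in rows]
    cpus = [r[2] for r in rows]
    return {
        "samples": len(rows),
        "mem_avg_gb": round(mean(mems), 3),
        "mem_peak_gb": max(mems),
        "cpu_avg_pct": round(mean(cpus), 1),
        "cpu_peak_pct": max(cpus),
    }


# Three lines copied from the Windows run posted above.
sample_log = [
    "[2024-08-23T09:53:53Z INFO  screenpipe_server::resource_monitor] Runtime: 10s, Total Memory: 2% (0.41 GB / 23.94 GB), Total CPU: 108%",
    "[2024-08-23T09:54:03Z INFO  screenpipe_server::resource_monitor] Runtime: 20s, Total Memory: 2% (0.44 GB / 23.94 GB), Total CPU: 105%",
    "[2024-08-23T09:54:33Z INFO  screenpipe_server::resource_monitor] Runtime: 50s, Total Memory: 2% (0.56 GB / 23.94 GB), Total CPU: 139%",
]
print(summarize(sample_log))
```

Feeding it the full console output (or the on-disk log) of a fixed-length run would give numbers that can be compared between runs, which is the repeatability part.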

louis030195 commented 3 weeks ago

i increased apple ocr speed by 30% already and increased accuracy by 25% now

chandeldivyam commented 3 weeks ago

i think we can start by measuring small parts (esp. those that dont need a monitor or audio device so it can run in CI) that likely use lot of compute

Right, which parts would you suggest we start the benchmarking with first?

Also, how do I run the vision part from the terminal with Windows native OCR? Currently I am not passing any arguments: .\target\release\screenpipe.exe

The memory is constant at ~300 MB; for a minute or so it went to around 600 MB but came down again (whatever it was must have been dropped afterwards). It has stayed there for the last ~30 minutes, with no errors in the console.

@louis030195

chandeldivyam commented 3 weeks ago

After about 60 iterations, Windows put it into efficiency mode. It is using around 45mb of memory.

Currently at iteration 87, still the same memory utilization. @m13v @louis030195

Audio files are being recorded, also transcriptions are being inserted to the db

louis030195 commented 3 weeks ago

@chandeldivyam

check args

screenpipe -h

screenpipe --ocr-engine windows-native

louis030195 commented 3 weeks ago

looking at apple Instruments, it seems the "chunking" part of screenpipe uses a ton of CPU

will push a benchmark for this

louis030195 commented 3 weeks ago

@chandeldivyam also if you can fix the windows version of this:

https://github.com/mediar-ai/screenpipe/blob/main/screenpipe-vision/benches/ocr_benchmark.rs

that would be great

cargo bench --bench ocr_benchmark

to run

chandeldivyam commented 3 weeks ago

@louis030195

check args

Have been running with -> .\target\release\screenpipe.exe --ocr-engine windows-native --audio-transcription-engine whisper-large for 64 iterations (32 minutes)

It gradually increased from 300mb to 1500mb but then suddenly came back to 300mb again. GPU VRAM is ~3 GB from screenpipe.

@chandeldivyam also if you can fix the windows version of this:

Checking this, will raise a PR

louis030195 commented 3 weeks ago

@m13v i will push a new version where i disabled chunking, because it:

- is not used (e.g. end user does not get any value, dead code atm)
- takes 200% CPU

it stays disabled until it can be used, is useful, and solves #110

*memory is another problem

chandeldivyam commented 3 weeks ago

looking at apple Instruments, it seems the "chunking" part of screenpipe uses ton of CPU

Yes, similar observation on Windows. I cannot pinpoint what it is, but it happens every 30 seconds, when these logs appear:

[2024-08-23T11:31:41Z INFO  screenpipe_server::core] Finished iteration 75 for device Headphones (4- High Definition Audio Device) (output)
[2024-08-23T11:31:41Z INFO  screenpipe_audio::core] device: "Headphones (4- High Definition Audio Device) (output)"
[2024-08-23T11:31:41Z INFO  screenpipe_audio::core] Recording Headphones (4- High Definition Audio Device) (output) for 30 seconds
[2024-08-23T11:31:41Z INFO  screenpipe_audio::stt] Resampling from 48000 Hz to 16000 Hz
[2024-08-23T11:31:41Z INFO  screenpipe_audio::stt] Total audio_frames processed: 3003, frames that include speech: 2519
[2024-08-23T11:31:41Z INFO  screenpipe_server::resource_monitor] Runtime: 2270s, Total Memory: 5% (1.25 GB / 23.94 GB), Total CPU: 17%
[2024-08-23T11:31:42Z INFO  screenpipe_audio::multilingual] detected language: ("en", "english")
[2024-08-23T11:31:43Z INFO  screenpipe_audio::stt] 0.0s -- 30.0s
[2024-08-23T11:31:43Z INFO  screenpipe_audio::stt]   0.0s-0.0s:
[2024-08-23T11:31:43Z INFO  screenpipe_audio::stt]   0.0s-6.9s:  for blind people so can you speak to stimulating the visual cortex I mean the
[2024-08-23T11:31:43Z INFO  screenpipe_audio::stt]   6.9s-12.0s:  possibilities there are just incredible to be able to give that gift back to people who
[2024-08-23T11:31:43Z INFO  screenpipe_audio::stt]   12.0s-17.2s:  don't have sight or even any aspect of that can you just speak to the challenges of
[2024-08-23T11:31:43Z INFO  screenpipe_audio::stt]   17.2s-21.5s:  there's several challenges here many one of which is like you said from
[2024-08-23T11:31:43Z INFO  screenpipe_audio::stt]   21.5s-25.2s:  recording to the stimulation just any aspect of that
[2024-08-23T11:31:43Z INFO  screenpipe_audio::stt] 30.0s -- 45.0s
[2024-08-23T11:31:43Z INFO  screenpipe_audio::stt]   0.0s-0.0s:
[2024-08-23T11:31:43Z INFO  screenpipe_audio::stt]   0.0s-1.0s:  You know,
[2024-08-23T11:31:43Z INFO  screenpipe_server::core] Received transcription
[2024-08-23T11:31:43Z INFO  screenpipe_server::core] Inserting audio chunk: "C:\\Users\\ABC\\.screenpipe\\data\\Headphones (4- High Definition Audio Device) (output)_2024-08-23_11-31-11.mp4"
[2024-08-23T11:31:43Z INFO  screenpipe_server::db] Successfully chunked audio transcription into 2 chunks

There is a spike in CPU utilization.

chandeldivyam commented 3 weeks ago

@chandeldivyam also if you can fix the windows version of this:

https://github.com/mediar-ai/screenpipe/pull/207

@louis030195

chandeldivyam commented 3 weeks ago

Ran screenpipe with .\target\release\screenpipe.exe --ocr-engine windows-native --audio-transcription-engine whisper-large for over an hour. For some time memory increased, then it came back to normal again.

Initially 400mb -> 1500mb (for decent amount of time) -> 400mb (again for a decent amount of time)

But, there is no image being captured. Checked the db and storage location. What could the issue be? There is no error log either.

Neither is image capture running from the app.

louis030195 commented 3 weeks ago

@chandeldivyam never saw this issue before EXCEPT when windows defender decides to delete screenpipe or similar

so there are no error logs or anything related to vision? you can also try adding --debug to get more info

chandeldivyam commented 3 weeks ago

so there is no logs error or something related to vision? you can try adding --debug also to have more info

will check this

chandeldivyam commented 3 weeks ago

[2024-08-23T13:41:09Z DEBUG screenpipe_server::db] OCR text inserted into db successfully
[2024-08-23T13:41:09Z ERROR xcap::platform::impl_window] Access is denied. (0x80070005)
[2024-08-23T13:41:09Z ERROR xcap::platform::impl_window] Access is denied. (0x80070005)

Maybe this could be the reason? I can see entries in ocr_text table

In video_chunks I can see C:\Users\ABC\.screenpipe\data\2024-08-23_13-40-52.mp4 but this file is 0KB

Maybe some permission issue?

@louis030195

louis030195 commented 3 weeks ago

btw for mac, the memory issues i suspect are:

- https://github.com/RustAudio/cpal/pull/894/files (we rely on this for macos audio output)
- the cpal lib in general, which leaks according to xcode Instruments

to be confirmed

louis030195 commented 3 weeks ago

[2024-08-23T13:41:09Z DEBUG screenpipe_server::db] OCR text inserted into db successfully
[2024-08-23T13:41:09Z ERROR xcap::platform::impl_window] Access is denied. (0x80070005)
[2024-08-23T13:41:09Z ERROR xcap::platform::impl_window] Access is denied. (0x80070005)
Maybe this could be the reason? I can see entries in ocr_text table

In video_chunks I can see C:\Users\ABC\.screenpipe\data\2024-08-23_13-40-52.mp4 but this file is 0KB

Maybe some permission issue?

@louis030195

can you try to run this maybe

https://github.com/nashaofu/xcap/blob/master/examples/window.rs

chandeldivyam commented 3 weeks ago

Ran the terminal as administrator, didn't get the error.

can you try to run this maybe

https://github.com/nashaofu/xcap/blob/master/examples/window.rs

I think this should work because we are getting screen captures, as the ocr_text table is getting filled. But let me check.

louis030195 commented 3 weeks ago

so first source of leaks: https://github.com/RustAudio/cpal/pull/894/files

second is xcap:

i suggest we switch to scap, even if we lose the app name and window name feature for now (which we can implement within 1-2 days i bet), so we fix memory issues first

louis030195 commented 3 weeks ago

https://github.com/mediar-ai/screenpipe/blob/83c9b79413a7bdc790ab95080d211e0fb5b7093e/screenpipe-vision/src/capture_screenshot_by_window.rs#L49

https://github.com/nashaofu/xcap/blob/daff24c6a50bf1fc38f6f9974ced71d8d48004c5/src/macos/impl_window.rs#L177

https://github.com/servo/core-foundation-rs/blob/b2fdaf4132a8fff7aadd6885ec7c570296a664ea/core-graphics/src/display.rs#L917

louis030195 commented 3 weeks ago

trying a hack now to fix with xcap

louis030195 commented 3 weeks ago

update: any help to switch to this: https://github.com/mediar-ai/scap would be good

we need to impl: app_name, window_name and these

https://github.com/CapSoftware/scap/issues/114

https://github.com/CapSoftware/scap/issues/113

https://github.com/CapSoftware/scap/issues/111

chandeldivyam commented 3 weeks ago

Found an issue that was making my RAM go crazy. After the changes, I have been recording the screen for the last 40 minutes and RAM didn't move an inch.

We are never popping `frame_queue`.

So it grows until it hits the max size and takes up all the memory.

screenpipe-server\src\video.rs

`let frame_queue = Arc::new(ArrayQueue::new(MAX_QUEUE_SIZE));` We initialize it, but nothing in our codebase ever pops from it. So memory keeps increasing until we reach MAX_QUEUE_SIZE, but by then it's too late, as MAX_QUEUE_SIZE = 100.

Before it gets there, my system has basically crashed. At a length of around 10 it was using 5GB, so it would need about 50GB to reach 100.

Changed it to max = 10 and have been running the recording at --fps 1. It's been going on for 50 minutes with no memory increase.

I checked everywhere, and from my understanding ocr_frame_queue and video_frame_queue are being consumed, but frame_queue is dead code. [I could be wrong here. I am not sure why we used it, and I would love to know.]

We should do two things now:

- Immediate fix: reduce MAX_QUEUE_SIZE to something smaller, like 10
- Refactor the code to remove frame_queue, as it is not needed
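
To illustrate why a bounded queue that nobody pops still hurts, here is a minimal sketch. It is not the screenpipe code: `BoundedFrameQueue` and `worst_case_gb` are hypothetical names, a std `VecDeque` stands in for the `ArrayQueue` mentioned above, and the 0.5 GB per queued frame figure is only derived from the rough "len 10 was 5GB" estimate in this comment.

```rust
use std::collections::VecDeque;

/// Fixed-capacity frame buffer that discards the oldest frame when full.
/// Hypothetical stand-in for illustration, not the type screenpipe uses.
pub struct BoundedFrameQueue<T> {
    frames: VecDeque<T>,
    capacity: usize,
}

impl<T> BoundedFrameQueue<T> {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self {
            frames: VecDeque::with_capacity(capacity),
            capacity,
        }
    }

    /// Pushes a frame; returns the evicted oldest frame if the queue was full.
    pub fn push(&mut self, frame: T) -> Option<T> {
        let evicted = if self.frames.len() == self.capacity {
            self.frames.pop_front()
        } else {
            None
        };
        self.frames.push_back(frame);
        evicted
    }

    /// Removes and returns the oldest frame, if any.
    pub fn pop(&mut self) -> Option<T> {
        self.frames.pop_front()
    }

    pub fn len(&self) -> usize {
        self.frames.len()
    }
}

/// Worst-case memory held by a full queue that is never consumed, in GB.
pub fn worst_case_gb(capacity: usize, gb_per_frame: f64) -> f64 {
    capacity as f64 * gb_per_frame
}
```

Under that estimate, `worst_case_gb(100, 0.5)` is 50 GB and `worst_case_gb(10, 0.5)` is 5 GB, so lowering the cap only bounds the damage; removing the unused queue avoids holding the frames at all.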

@louis030195 @m13v

chandeldivyam commented 3 weeks ago

[2024-08-23T19:50:31Z INFO  screenpipe_server::video] Starting FFmpeg process for file: C:\Users\ABC\.screenpipe\data\2024-08-23_19-50-31.mp4
[2024-08-23T19:50:31Z INFO  screenpipe_server::resource_monitor] Runtime: 3050s, Total Memory: 1% (0.30 GB / 23.94 GB), Total CPU: 55%
[2024-08-23T19:50:51Z INFO  screenpipe_server::resource_monitor] Runtime: 3070s, Total Memory: 1% (0.19 GB / 23.94 GB), Total CPU: 50%
[2024-08-23T19:51:01Z INFO  screenpipe_server::resource_monitor] Runtime: 3080s, Total Memory: 1% (0.30 GB / 23.94 GB), Total CPU: 46%
[2024-08-23T19:51:11Z INFO  screenpipe_server::resource_monitor] Runtime: 3090s, Total Memory: 2% (0.50 GB / 23.94 GB), Total CPU: 49%
[2024-08-23T19:51:21Z INFO  screenpipe_server::resource_monitor] Runtime: 3100s, Total Memory: 1% (0.20 GB / 23.94 GB), Total CPU: 55%

m13v commented 3 weeks ago

Perfect, create a pull request, seems like you've solved it! Congrats!


chandeldivyam commented 3 weeks ago

Perfect, create a pull request, seems like you've solved it! Congrats!

#210

@m13v

louis030195 commented 3 weeks ago

@chandeldivyam good job, actually this was my bad: i introduced it when trying to solve this issue by using data structures that don't grow infinitely (e.g. VecDeque -> ArrayQueue, and avoiding anti-patterns like Mutex), but it seems the `if` was incorrect

i released new version with this fix in app & brew now

this does not solve the original issue though (my memory is still growing forever), which is essentially xcap and screencapture leaking memory because objects are not released in unsafe code blocks

https://github.com/RustAudio/cpal/pull/894/files#diff-8217ad4e41585a42aa590f64ebde590c3706d97f7b658632ec80e46196dde67eR13

https://github.com/nashaofu/xcap/blob/daff24c6a50bf1fc38f6f9974ced71d8d48004c5/src/macos/impl_window.rs#L182

i suggest we focus on switching to scap now, which means we need to impl app_name, window_name and these:

CapSoftware/scap#114

CapSoftware/scap#113

CapSoftware/scap#111

regarding macos audio output, the leak seems acceptable for now because we don't call the function that frequently (once every 30s, while Window::all() is probably called >5 times per frame per monitor)
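
As a rough back-of-the-envelope check of that comparison, the two call rates can be put on the same scale. This is only a sketch: the function names are hypothetical, and "5 calls per frame" is just the lower bound guessed above, not a measured number.

```rust
/// Calls per hour for a function invoked once every `period_s` seconds.
pub fn calls_per_hour_periodic(period_s: f64) -> f64 {
    3600.0 / period_s
}

/// Calls per hour for a function invoked `calls_per_frame` times
/// for every captured frame on each monitor.
pub fn calls_per_hour_per_frame(fps: f64, monitors: u32, calls_per_frame: u32) -> f64 {
    fps * 3600.0 * monitors as f64 * calls_per_frame as f64
}
```

With one monitor at --fps 1, the window listing would run at least 18,000 times per hour versus 120 times per hour for the audio output path, so a leak of similar size per call would show up roughly 150 times faster on the capture side.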

chandeldivyam commented 3 weeks ago

@louis030195 Yes, we should also first create a mechanism to benchmark. I feel this way because the xcap issue must be with screencapturekit.

Because I ran screenpipe (on Windows) without audio for hours, and the memory didn't move. It was the same as or lower than where it started in the first 30 seconds.

chandeldivyam commented 3 weeks ago

Maybe just something like a plot against time during video capture? The current resource report itself (which we see in the terminal), plotted over time? This could potentially help us understand if something is changing.
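
One way to get points for such a plot is to parse the resource_monitor lines already printed in the terminal. The sketch below assumes the line format seen in the logs in this thread ("Runtime: 3090s, Total Memory: 2% (0.50 GB / 23.94 GB), Total CPU: 49%"); `ResourceSample`, `between` and `parse_resource_line` are hypothetical helpers, not existing screenpipe functions.

```rust
/// One data point extracted from a resource_monitor log line.
#[derive(Debug, PartialEq)]
pub struct ResourceSample {
    pub runtime_s: u64,
    pub memory_gb: f64,
    pub total_memory_gb: f64,
    pub cpu_percent: f64,
}

/// Returns the text between `start` and the next `end` that follows it.
fn between<'a>(line: &'a str, start: &str, end: &str) -> Option<&'a str> {
    let from = line.find(start)? + start.len();
    let len = line[from..].find(end)?;
    Some(&line[from..from + len])
}

/// Parses a line of the assumed format; returns `None` for any other line.
pub fn parse_resource_line(line: &str) -> Option<ResourceSample> {
    let runtime_s = between(line, "Runtime: ", "s,")?.parse().ok()?;
    // Memory is logged as "<percent>% (<used> GB / <total> GB)".
    let memory = between(line, "Total Memory: ", ")")?;
    let memory_gb = between(memory, "(", " GB /")?.parse().ok()?;
    let total_memory_gb = between(memory, "/ ", " GB")?.parse().ok()?;
    let cpu_percent = between(line, "Total CPU: ", "%")?.parse().ok()?;
    Some(ResourceSample {
        runtime_s,
        memory_gb,
        total_memory_gb,
        cpu_percent,
    })
}
```

Running this over a saved console log and keeping only the `Some` results gives (runtime, memory, CPU) points that can be plotted with any tool.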

louis030195 commented 3 weeks ago

@louis030195 Yes, we should also first create a mechanism to benchmark. I feel this because xcap issue must be with screencapturekit.

Because I ran screenpipe (on windows) without audio for hours, the memory didn't move. It was same / lower than where it started in first 30 seconds.

xcap does not use screencapturekit (mac), they use the old apple api

screencapturekit is the new api for mac, used in scap

using scap would also solve #63 (about 3-4 linux users cannot use screenpipe because of this), and scap is also 21x faster than xcap on mac capture (tested)

we log to files resource usage in here: https://github.com/mediar-ai/screenpipe/blob/main/screenpipe-server/src/resource_monitor.rs

you need to add SAVE_RESOURCE_USAGE=true in env var before running cli

and i did a google colab to create charts out of this data: https://colab.research.google.com/drive/1zELlGdzGdjChWKikSqZTHekm5XRxY-1r?usp=sharing

in the past i was working in an observability team tracking the performance of billions of devices with prometheus + grafana, but i don't think this is good for consumer things, we just have to write metrics ourselves

ideally we should use this well (although this is more for logging): https://github.com/tokio-rs/tracing

i think we should take inspiration on how they use it:

https://github.com/search?q=repo:huggingface/candle%20span&type=code

chandeldivyam commented 3 weeks ago

Great, I think we are well researched then: there is a tangible benefit in moving from xcap -> scap

Let me look into both xcap and scap and try to migrate us to scap.

Things I am wondering right now:

- What are the impls that xcap and scap provide, and how different are they?
  - I did see the issues you have created for visible/focus window and app name.
- How are we consuming them, and what would we need to change in screenpipe when we switch?

Let me look into this section and will update.

louis030195 commented 3 weeks ago

@chandeldivyam good!

i will focus on "visible/focus window and app name" for macos now

and push here https://github.com/mediar-ai/scap

once this is usable on macos/linux/windows lets replace xcap in screenpipe

chandeldivyam commented 3 weeks ago

Perfect

chandeldivyam commented 3 weeks ago

I went through the screenpipe code as well.

We should ideally take the response from scap and transform it into a backward-compatible format.

screenpipe-vision\src\monitor.rs
screenpipe-vision\src\capture_screenshot_by_window.rs

The functions that capture and validate the images would change, and we can transform the new scap output into our impl. That is what I was thinking.

louis030195 commented 3 weeks ago

this was not the issue

louis030195 commented 3 weeks ago

the main issue is solved by upgrading xcap to the latest version for macos, which includes a fix for the memory leak

the bounty is still live to solve the 2nd memory leak here:

https://github.com/louis030195/cpal-d

this is less of a problem because we call it infrequently, so it would cause memory issues only after screenpipe has been running for days

louis030195 commented 3 weeks ago

increasing the bounty

/bounty 150

i.e. make https://github.com/louis030195/cpal-d stop leaking

louis030195 commented 3 weeks ago

#236 moving issue here (bounty 150)

mediar-ai / screenpipe

CPU Utilization > 100% and Memory Utilization > 10GB #183
