Closed — louis030195 closed this 2 months ago
/attempt #278 with your implementation plan
/claim #278 in the PR body to claim the bounty
Thank you for contributing to mediar-ai/screenpipe!
Hi, I'd be interested in giving this a shot if you could give me instructions on how exactly to run this to trigger the problem.
I am (mostly) free ATM, so I would not mind fixing this leak too.
@louis030195
Could you provide more details, like the leaks logs included last time?
@FractalFir last time?
this is current process leaks (after 27 min, uses 8 gb):
https://gist.github.com/louis030195/41914b36910efcbf9cb96e96714eee68
but i think the leaks command or UI is not helpful anymore, this only shows a 7mb leak
that's why i'm looking for other ways to profile. are you on linux? i heard about this https://github.com/flamegraph-rs/flamegraph
but it does not work on mac
to build CLI on linux:
sudo apt-get update
sudo apt-get install -y libavformat-dev libavfilter-dev libavdevice-dev ffmpeg libasound2-dev tesseract-ocr libtesseract-dev
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
git clone https://github.com/mediar-ai/screenpipe
cd screenpipe
cargo build --release
./target/release/screenpipe
screencapture does not work on Wayland fyi
you can profile heap memory usage with jemalloc_pprof on Linux.
Apply this diff:
diff --git a/screenpipe-server/Cargo.toml b/screenpipe-server/Cargo.toml
index 169b99c..542e4b1 100644
--- a/screenpipe-server/Cargo.toml
+++ b/screenpipe-server/Cargo.toml
@@ -75,6 +75,10 @@ async-trait = "0.1.68"
ndarray = "0.15.6"
rust-stemmers = "1.2.0"
+tikv-jemallocator = { version = "0.5.0", features = ["profiling", "unprefixed_malloc_on_supported_platforms"] }
+jemalloc_pprof = "0.4.2"
+
+
[dev-dependencies]
tempfile = "3.3.0"
diff --git a/screenpipe-server/src/bin/screenpipe-server.rs b/screenpipe-server/src/bin/screenpipe-server.rs
index 0cb2b5e..d2aed29 100644
--- a/screenpipe-server/src/bin/screenpipe-server.rs
+++ b/screenpipe-server/src/bin/screenpipe-server.rs
@@ -69,8 +69,47 @@ fn get_base_dir(custom_path: Option<String>) -> anyhow::Result<PathBuf> {
Ok(base_dir)
}
+#[cfg(not(target_env = "msvc"))]
+#[global_allocator]
+static ALLOC: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;
+
+#[allow(non_upper_case_globals)]
+#[export_name = "malloc_conf"]
+pub static malloc_conf: &[u8] = b"prof:true,prof_active:true,lg_prof_sample:19\0";
+
+use axum::http::StatusCode;
+use axum::response::IntoResponse;
+
+pub async fn handle_get_heap() -> Result<impl IntoResponse, (StatusCode, String)> {
+ let mut prof_ctl = jemalloc_pprof::PROF_CTL.as_ref().unwrap().lock().await;
+ require_profiling_activated(&prof_ctl)?;
+ let pprof = prof_ctl
+ .dump_pprof()
+ .map_err(|err| (StatusCode::INTERNAL_SERVER_ERROR, err.to_string()))?;
+ Ok(pprof)
+}
+
+/// Checks whether jemalloc profiling is activated and returns an error response if not.
+fn require_profiling_activated(prof_ctl: &jemalloc_pprof::JemallocProfCtl) -> Result<(), (StatusCode, String)> {
+ if prof_ctl.activated() {
+ Ok(())
+ } else {
+ Err((axum::http::StatusCode::FORBIDDEN, "heap profiling not activated".into()))
+ }
+}
+
#[tokio::main]
async fn main() -> anyhow::Result<()> {
+ let app = axum::Router::new()
+ .route("/debug/pprof/heap", axum::routing::get(handle_get_heap));
+
+ // run our app with hyper, listening globally on port 3000
+ let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
+
+ tokio::spawn(async {
+ axum::serve(listener, app).await.unwrap();
+ });
+
let cli = Cli::parse();
if find_ffmpeg_path().is_none() {
install pprof (assuming you have golang installed):
go install github.com/google/pprof@latest
get a heap pprof and analyze it with the pprof tool:
curl http://localhost:3000/debug/pprof/heap > out.pprof && ~/go/bin/pprof -http : out.pprof
(I suggest navigating to "flamegraph" in the pprof UI)
last time?
I meant like when I worked on fixing the leak in screencapturekit-rs.
Yes, I am on Linux, currently downloading and building the CLI. I have used things like cargo-flamegraph quite a bit before, because I was dealing with high memory usage in my own projects.
Well, the output still tells us a few things. None of the leaks are > 64 bytes, which suggests that the leaked object is small. It is unlikely to be a video frame / audio sample.
@louis030195
screencapture does not work on Wayland fyi
what is "screencapture" ?
And does this mean it is impossible to reproduce the leak on wayland?
screenpipe takes screenshots of all your windows on all your monitors continuously and does OCR + mp4 encoding to disk
it also records audio continuously and does STT + mp4 encoding
and it means you cannot reproduce the vision leaks on wayland; you can reproduce audio leaks though (might break down the bounty into smaller ones if there is a leak in both audio and vision)
you can disable audio or vision using --disable-vision
or --disable-audio
I'm repeatedly getting this error when running (in X):
[2024-09-05T18:31:41Z ERROR screenpipe_server::video] Failed to write frame to ffmpeg: Broken pipe (os error 32)
hmm
this is another issue that nobody found how to reproduce actually https://github.com/mediar-ai/screenpipe/issues/228
I think I can replicate the leak on my machine, and it seems a bit bigger on Linux.
[2024-09-05T18:43:49Z INFO screenpipe_server::resource_monitor] Runtime: 310s, Total Memory: 21% (3.33 GB / 15.72 GB), Total CPU: 17%
[2024-09-05T18:44:19Z INFO screenpipe_server::resource_monitor] Runtime: 340s, Total Memory: 23% (3.65 GB / 15.72 GB), Total CPU: 617%
300 MB in 30 seconds is quite a lot. I will be analysing the exact cause.
It looks like memory usage goes up in very sudden bursts of allocations.
keep in mind we load a whisper-large model into memory at boot (nvidia if using the cuda feature, apple stuff when using the metal feature, otherwise RAM+CPU) for audio transcription
also leaks shows a big leak at boot every time, but i cannot see the full stack for some reason in the UI and it does not show in the CLI: https://github.com/huggingface/candle/issues/2271#issuecomment-2323516825
seems correlated to model loading but not sure
Yeah, I will let it run for a bit longer to have more accurate data. I thought 1 minute would be enough to initialize everything, but giving it more time will not hurt.
(will keep updating this msg w perf logs) atm trying different setups myself:
11:47 am
pid 57490 - alacritty - ./target/release/screenpipe --fps 0.2 --audio-transcription-engine whisper-large --audio-device "MacBook Pro Microphone (input)" --data-dir /tmp/sp --ocr-engine apple-native --port 3038
pid 57424 - app - /Applications/screenpipe.app/Contents/MacOS/screenpipe --port 3030 --fps 0.2 --audio-transcription-engine whisper-large --ocr-engine apple-native --audio-device "MacBook Pro Microphone (input)"
pid 57647 - cursor - cargo run --bin screenpipe -- --disable-audio --fps 0.2 --ocr-engine apple-native --port 3031 --data-dir /tmp/spp
at 7m:
30m
1h40m
ofc running parallel stuff adds more noise
I will let it run for some more time to get a better picture of what is happening.
This is quite a weird issue.
I have run the executable under heaptrack to see the exact cause of the leak.
I think the memory is leaking, but according to heaptrack, the memory usage seems to stay the same.
Heaptrack also thinks that the peak memory usage was 3.7 GB (or 4.6 GB including heaptrack overhead). However, this is not the case according to the memory usage metrics, which claim a higher usage:
Runtime: 491s, Total Memory: 30% (4.71 GB / 15.72 GB), Total CPU: 787%
So, it seems like heaptrack will not be enough, and I will try using valgrind. It is much slower, but should give more accurate info.
I have run the program under valgrind for some time, and have some initial results.
==147072== LEAK SUMMARY:
==147072== definitely lost: 7,800 bytes in 110 blocks
==147072== indirectly lost: 11,223 bytes in 60 blocks
==147072== possibly lost: 12,615,327 bytes in 165,968 blocks
==147072== still reachable: 3,139,704,411 bytes in 14,658 blocks
==147072== of which reachable via heuristic:
==147072== length64 : 292,104 bytes in 1,346 blocks
==147072== suppressed: 332 bytes in 2 blocks
==147072==
==147072== For lists of detected and suppressed errors, rerun with: -s
==147072== ERROR SUMMARY: 595 errors from 595 contexts (suppressed: 2 from 2)
The still reachable blocks are memory which is still accessible to the program, but was not freed when I stopped it (for example, the whisper model).
Directly and indirectly lost blocks are pieces of memory valgrind knows can't be freed. However, those were allocated in the C code of some Linux audio utilities, and should not be the cause of the problem.
The 12.6 MB of "possibly lost" memory kind of looks like it could be the leak, but I am not sure.
The thing about "possibly lost" blocks is that they could be still reachable, so false positives are not out of the question.
Some things seem to suggest that at least some of the leaks you have observed are included in that "possibly lost" memory.
You have said that you think you might have a leak related to model loading. This to me looks like it could be that leak:
==147072== 19,660,800 bytes in 1 blocks are still reachable in loss record 3,096 of 3,110
==147072== at 0x5758866: malloc (vg_replace_malloc.c:446)
==147072== by 0x4A407B7: UnknownInlinedFun (alloc.rs:98)
==147072== by 0x4A407B7: UnknownInlinedFun (alloc.rs:181)
==147072== by 0x4A407B7: UnknownInlinedFun (alloc.rs:241)
==147072== by 0x4A407B7: UnknownInlinedFun (raw_vec.rs:478)
==147072== by 0x4A407B7: with_capacity_in<alloc::alloc::Global> (raw_vec.rs:425)
==147072== by 0x4A407B7: with_capacity_in<f32, alloc::alloc::Global> (raw_vec.rs:202)
==147072== by 0x4A407B7: with_capacity_in<f32, alloc::alloc::Global> (mod.rs:698)
==147072== by 0x4A407B7: with_capacity<f32> (mod.rs:480)
==147072== by 0x4A407B7: from_iter<f32, core::iter::adapters::map::Map<core::slice::iter::Iter<half::binary16::f16>, candle_core::cpu_backend::utils::unary_map::{closure_env#0}<half::binary16::f16, f32, candle_core::cpu_backend::{impl#27}::to_dtype::{closure_env#14}>>> (spec_from_iter_nested.rs:52)
==147072== by 0x4A407B7: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter (spec_from_iter.rs:33)
==147072== by 0x4AB478A: from_iter<f32, core::iter::adapters::map::Map<core::slice::iter::Iter<half::binary16::f16>, candle_core::cpu_backend::utils::unary_map::{closure_env#0}<half::binary16::f16, f32, candle_core::cpu_backend::{impl#27}::to_dtype::{closure_env#14}>>> (mod.rs:2986)
==147072== by 0x4AB478A: collect<core::iter::adapters::map::Map<core::slice::iter::Iter<half::binary16::f16>, candle_core::cpu_backend::utils::unary_map::{closure_env#0}<half::binary16::f16, f32, candle_core::cpu_backend::{impl#27}::to_dtype::{closure_env#14}>>, alloc::vec::Vec<f32, alloc::alloc::Global>> (iterator.rs:2000)
==147072== by 0x4AB478A: candle_core::cpu_backend::utils::unary_map (utils.rs:285)
==147072== by 0x4A30394: <candle_core::cpu_backend::CpuStorage as candle_core::backend::BackendStorage>::to_dtype (mod.rs:1721)
==147072== by 0x4AC2B29: UnknownInlinedFun (storage.rs:182)
==147072== by 0x4AC2B29: candle_core::tensor::Tensor::to_dtype (tensor.rs:2019)
==147072== by 0x4A10D27: <candle_core::safetensors::MmapedSafetensors as candle_nn::var_builder::SimpleBackend>::get (var_builder.rs:382)
==147072== by 0x4A0F67C: <alloc::boxed::Box<dyn candle_nn::var_builder::SimpleBackend> as candle_nn::var_builder::Backend>::get (var_builder.rs:86)
==147072== by 0x490742A: get_with_hints_dtype<alloc::boxed::Box<dyn candle_nn::var_builder::SimpleBackend, alloc::alloc::Global>, (usize, usize, usize)> (var_builder.rs:198)
==147072== by 0x490742A: get_with_hints<alloc::boxed::Box<dyn candle_nn::var_builder::SimpleBackend, alloc::alloc::Global>, (usize, usize, usize)> (var_builder.rs:181)
==147072== by 0x490742A: candle_nn::var_builder::VarBuilderArgs<B>::get (var_builder.rs:186)
==147072== by 0x48F6C3D: candle_transformers::models::whisper::model::conv1d (model.rs:13)
==147072== by 0x48FC8B9: load (model.rs:259)
==147072== by 0x48FC8B9: candle_transformers::models::whisper::model::Whisper::load (model.rs:382)
==147072== by 0x268B5F9: screenpipe_audio::stt::WhisperModel::new (stt.rs:77)
==147072== by 0x1E741DD: {async_fn#0} (stt.rs:731)
==147072== by 0x1E741DD: {async_fn#0} (core.rs:51)
==147072== by 0x1E741DD: screenpipe::main::{{closure}}::{{closure}}::{{closure}} (screenpipe-server.rs:327)
==147072==
==147072== 26,214,400 bytes in 1 blocks are still reachable in loss record 3,097 of 3,110
==147072== at 0x5758866: malloc (vg_replace_malloc.c:446)
==147072== by 0x4A407B7: UnknownInlinedFun (alloc.rs:98)
==147072== by 0x4A407B7: UnknownInlinedFun (alloc.rs:181)
==147072== by 0x4A407B7: UnknownInlinedFun (alloc.rs:241)
==147072== by 0x4A407B7: UnknownInlinedFun (raw_vec.rs:478)
==147072== by 0x4A407B7: with_capacity_in<alloc::alloc::Global> (raw_vec.rs:425)
==147072== by 0x4A407B7: with_capacity_in<f32, alloc::alloc::Global> (raw_vec.rs:202)
==147072== by 0x4A407B7: with_capacity_in<f32, alloc::alloc::Global> (mod.rs:698)
==147072== by 0x4A407B7: with_capacity<f32> (mod.rs:480)
==147072== by 0x4A407B7: from_iter<f32, core::iter::adapters::map::Map<core::slice::iter::Iter<half::binary16::f16>, candle_core::cpu_backend::utils::unary_map::{closure_env#0}<half::binary16::f16, f32, candle_core::cpu_backend::{impl#27}::to_dtype::{closure_env#14}>>> (spec_from_iter_nested.rs:52)
==147072== by 0x4A407B7: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter (spec_from_iter.rs:33)
==147072== by 0x4AB478A: from_iter<f32, core::iter::adapters::map::Map<core::slice::iter::Iter<half::binary16::f16>, candle_core::cpu_backend::utils::unary_map::{closure_env#0}<half::binary16::f16, f32, candle_core::cpu_backend::{impl#27}::to_dtype::{closure_env#14}>>> (mod.rs:2986)
==147072== by 0x4AB478A: collect<core::iter::adapters::map::Map<core::slice::iter::Iter<half::binary16::f16>, candle_core::cpu_backend::utils::unary_map::{closure_env#0}<half::binary16::f16, f32, candle_core::cpu_backend::{impl#27}::to_dtype::{closure_env#14}>>, alloc::vec::Vec<f32, alloc::alloc::Global>> (iterator.rs:2000)
==147072== by 0x4AB478A: candle_core::cpu_backend::utils::unary_map (utils.rs:285)
==147072== by 0x4A30394: <candle_core::cpu_backend::CpuStorage as candle_core::backend::BackendStorage>::to_dtype (mod.rs:1721)
==147072== by 0x4AC2B29: UnknownInlinedFun (storage.rs:182)
==147072== by 0x4AC2B29: candle_core::tensor::Tensor::to_dtype (tensor.rs:2019)
==147072== by 0x4A10D27: <candle_core::safetensors::MmapedSafetensors as candle_nn::var_builder::SimpleBackend>::get (var_builder.rs:382)
==147072== by 0x4A10B7C: UnknownInlinedFun (var_builder.rs:86)
==147072== by 0x4A10B7C: candle_nn::var_builder::VarBuilderArgs<B>::get_with_hints_dtype (var_builder.rs:198)
==147072== by 0x4A1058A: get_with_hints<alloc::boxed::Box<dyn candle_nn::var_builder::SimpleBackend, alloc::alloc::Global>, (usize, usize)> (var_builder.rs:181)
==147072== by 0x4A1058A: candle_nn::linear::linear (linear.rs:62)
==147072== by 0x4906469: candle_transformers::models::with_tracing::linear (with_tracing.rs:57)
==147072== by 0x48F9B04: candle_transformers::models::whisper::model::ResidualAttentionBlock::load (model.rs:163)
==147072== by 0x4905071: {closure#0} (model.rs:263)
==147072== by 0x4905071: {closure#0}<usize, core::result::Result<candle_transformers::models::whisper::model::ResidualAttentionBlock, candle_core::error::Error>, (), core::ops::control_flow::ControlFlow<core::ops::control_flow::ControlFlow<candle_transformers::models::whisper::model::ResidualAttentionBlock, ()>, ()>, candle_transformers::models::whisper::model::{impl#2}::load::{closure_env#0}, core::iter::adapters::{impl#0}::try_fold::{closure_env#0}<core::iter::adapters::map::Map<core::ops::range::Range<usize>, candle_transformers::models::whisper::model::{impl#2}::load::{closure_env#0}>, core::result::Result<core::convert::Infallible, candle_core::error::Error>, (), core::iter::traits::iterator::Iterator::try_for_each::call::{closure_env#0}<candle_transformers::models::whisper::model::ResidualAttentionBlock, core::ops::control_flow::ControlFlow<candle_transformers::models::whisper::model::ResidualAttentionBlock, ()>, fn(candle_transformers::models::whisper::model::ResidualAttentionBlock) -> core::ops::control_flow::ControlFlow<candle_transformers::models::whisper::model::ResidualAttentionBlock, ()>>, core::ops::control_flow::ControlFlow<candle_transformers::models::whisper::model::ResidualAttentionBlock, ()>>> (map.rs:95)
==147072== by 0x4905071: try_fold<core::ops::range::Range<usize>, (), core::iter::adapters::map::map_try_fold::{closure_env#0}<usize, core::result::Result<candle_transformers::models::whisper::model::ResidualAttentionBlock, candle_core::error::Error>, (), core::ops::control_flow::ControlFlow<core::ops::control_flow::ControlFlow<candle_transformers::models::whisper::model::ResidualAttentionBlock, ()>, ()>, candle_transformers::models::whisper::model::{impl#2}::load::{closure_env#0}, core::iter::adapters::{impl#0}::try_fold::{closure_env#0}<core::iter::adapters::map::Map<core::ops::range::Range<usize>, candle_transformers::models::whisper::model::{impl#2}::load::{closure_env#0}>, core::result::Result<core::convert::Infallible, candle_core::error::Error>, (), core::iter::traits::iterator::Iterator::try_for_each::call::{closure_env#0}<candle_transformers::models::whisper::model::ResidualAttentionBlock, core::ops::control_flow::ControlFlow<candle_transformers::models::whisper::model::ResidualAttentionBlock, ()>, fn(candle_transformers::models::whisper::model::ResidualAttentionBlock) -> core::ops::control_flow::ControlFlow<candle_transformers::models::whisper::model::ResidualAttentionBlock, ()>>, core::ops::control_flow::ControlFlow<candle_transformers::models::whisper::model::ResidualAttentionBlock, ()>>>, core::ops::control_flow::ControlFlow<core::ops::control_flow::ControlFlow<candle_transformers::models::whisper::model::ResidualAttentionBlock, ()>, ()>> (iterator.rs:2405)
==147072== by 0x4905071: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold (map.rs:121)
==147072== by 0x48EDFE9: UnknownInlinedFun (mod.rs:191)
==147072== by 0x48EDFE9: UnknownInlinedFun (iterator.rs:2467)
==147072== by 0x48EDFE9: UnknownInlinedFun (mod.rs:174)
==147072== by 0x48EDFE9: from_iter<candle_transformers::models::whisper::model::ResidualAttentionBlock, core::iter::adapters::GenericShunt<core::iter::adapters::map::Map<core::ops::range::Range<usize>, candle_transformers::models::whisper::model::{impl#2}::load::{closure_env#0}>, core::result::Result<core::convert::Infallible, candle_core::error::Error>>> (spec_from_iter_nested.rs:24)
==147072== by 0x48EDFE9: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter (spec_from_iter.rs:33)
(the bytes in the log are the total count, not the count in that leak).
However, once again, those are possible leaks, not "guaranteed leaks". Also, I am not sure if the leak seen on macOS is also present on Linux.
Could you try running the program under valgrind yourself? I just want to make sure the issue is present on both platforms.
EDIT: it looks like valgrind is not supported on ARM Macs :( I guess we will need to use something different.
Just to spare someone else the effort: I ran it under bytehound, and the results are similar to heaptrack. It claims the memory is lower than it actually is (according to htop) and does not even see the ever-increasing memory usage.
interesting
actually our resource monitor never properly recorded memory either: https://github.com/mediar-ai/screenpipe/blob/main/screenpipe-server/src/resource_monitor.rs
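As a cross-check against the resource monitor linked above, resident memory can be sampled directly from /proc on Linux with no extra crates. This is only an illustrative sketch, not the project's resource_monitor.rs:

```rust
use std::fs;

/// Resident set size (VmRSS) of this process in kB, read from
/// /proc/self/status. Returns None on non-Linux systems or parse failure.
fn read_vm_rss_kb() -> Option<u64> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    for line in status.lines() {
        // The line looks like: "VmRSS:    123456 kB"
        if let Some(rest) = line.strip_prefix("VmRSS:") {
            return rest.split_whitespace().next()?.parse().ok();
        }
    }
    None
}

fn main() {
    match read_vm_rss_kb() {
        Some(kb) => println!("resident memory: {kb} kB"),
        None => println!("VmRSS unavailable on this platform"),
    }
}
```

Logging this periodically gives numbers that match htop, which is useful when heap profilers disagree with the OS about usage.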
i need to go away for max 1h, will come back on this issue after (~2 pm here)
also you can run these other binaries to run smaller parts btw:
cargo build --release
./target/release/screenpipe # (end to end vision, audio, db, api)
./target/release/screenpipe-vision # just record vision + ocr (does not save files)
./target/release/screenpipe-audio-forever # just record audio + stt (save files)
# or through cargo run:
cargo run --bin screenpipe
# etc.
i need to go away for max 1h, will come back on this issue after (~2 pm here)
Understandable, I too will have to go away in a few hours.
I have found a way to make the leak faster, and more visible.
With those settings, the memory usage seems to grow from 0.67 GB at the start to 2GB after ~5.5 minutes.
[2024-09-05T21:09:16Z INFO screenpipe_server::resource_monitor] Runtime: 320s, Total Memory: 13% (2.02 GB / 15.72 GB), Total CPU: 688%
this seems to suggest this issue is related to video recording. However, this could also be a false positive, since video recording is resource intensive in general.
11:47 am
57490 - ./target/release/screenpipe --fps 0.2 --audio-transcription-engine whisper-large --audio-device "MacBook Pro Microphone (input)" --data-dir /tmp/sp --ocr-engine apple-native --port 3038
57424 - /Applications/screenpipe.app/Contents/MacOS/screenpipe --port 3030 --fps 0.2 --audio-transcription-engine whisper-large --ocr-engine apple-native --audio-device "MacBook Pro Microphone (input)"
57647 - cargo run --bin screenpipe -- --disable-audio --fps 0.2 --ocr-engine apple-native --port 3031 --data-dir /tmp/spp
1.29 pm
87637 - cargo run --bin screenpipe -- --disable-audio --port 3031 --data-dir /tmp/spp
2.03 pm
98906 - cargo run --bin screenpipe-vision
2.08 pm
1808 - cargo run --features metal --bin screenpipe-audio-forever -- --audio-device "MacBook Pro Microphone (input)"
stopped all at 2.55 pm:
easy way to reproduce:
cargo run --bin screenpipe-vision -- --fps 30
this quickly grows, despite no vision, server or db code:
cargo run --bin screenpipe-audio-forever -- --list-audio-devices
cargo run --bin screenpipe-audio-forever -- --audio-device "your audio device" --audio-chunk-duration 1
This does not seem to work on my machine, when I run
cargo run --bin screenpipe-vision -- --fps 30
I get:
error: unexpected argument '--fps' found
Usage: screenpipe-vision [OPTIONS]
For more information, try '--help'.
When I run cargo run --bin screenpipe-vision --help
I get:
Usage: screenpipe-vision [OPTIONS]
Options:
--save-text-files Save text files
--cloud-ocr-off Disable cloud OCR processing
-h, --help Print help
-V, --version Print version
So, I am unable to run the vision code standalone.
Are you on some specific branch?
git pull (just updated)
I think there might be more than one issue here, or a more fundamental one. I'm currently looking at screenpipe-audio-forever, which has a bug causing ffmpeg to fail immediately after the second recording.
This still grows.
fixed screenpipe-audio-forever
(try 5 for duration maybe instead) (git pull)
Also, at least on linux/ubuntu with pipewire as an alsa backend, each time we list the devices it seems to leak a bit inside pipewire. This particular leak can be seen in bytehound because there are a lot of libdbus allocations still hanging around and the alsa/pcm allocations also keep growing:
Both of these seem outside our control. My initial suspicion was that we were hanging onto the device handles, but that seems not to be the case because i can see the "drop" happening in my debugger.
I'll look at this more tomorrow, it's midnight here. Good luck!
It is also midnight for me, so I too will soon be heading to bed.
BTW: I can't reproduce the vision leak on Linux. The memory grows for some time, but then it stabilizes.
Question: could you provide the output of the leaks command for just the vision module?
export MallocStackLogging=1
leaks cargo run --bin screenpipe-vision -- --fps 30
I don't know if the leaks command tracks child processes, so while it might not have worked when running the whole project, it could work when running just the faulty component.
@FractalFir
i reached 12 gb after running screenpipe-vision with 30 fps for 48 min
will share leaks, do you think it could be apple native OCR?
also i mostly heard of memory issues from mac users and less from windows and linux users, but still some windows users found it using too much memory/cpu sometimes (esp on computers with only 16 gb ram or no GPU)
this is the code:
https://github.com/mediar-ai/screenpipe/blob/main/screenpipe-vision/src/apple.rs
https://github.com/mediar-ai/screenpipe/blob/main/screenpipe-vision/src/ocr.swift
leaks: https://gist.github.com/louis030195/92717aaedfde57e592bb424567aeeeb6
(note that i made a few changes to the core.rs vision code just now before running the leaks command which could have improved perf; i just see overuse of arc and clones in the vision code that is not necessary)
The leaks output is sadly not very helpful (since it seems to be mostly empty).
will share leaks, do you think it could be apple native OCR?
Well, there is a way to check if this is caused by Apple native OCR.
Is the leak still present when you switch to a different OCR engine?
E.g. when you run with --ocr-engine unstructured?
If the leak is present with a different OCR engine, then the issue must be somewhere else. If it disappears after changing engines, then this must be related to Apple OCR.
running screenpipe-vision with tesseract with 120 fps right now to see
well
with tesseract screenpipe uses less than 50 mb while apple uses 4 gb
looking into the swift code now
trying some changes on the swift/rs code related to apple ocr
(does not seem to be the issue)
Does just calling this function in a loop leak memory?
If so, then we know that the issue is in that function and that function alone.
If the issue is there, can you replicate it in swift? For example, by just passing some hardcoded image?
OK, so it looks like the leak is there.
There must be some kind of bug in the swift code, so the next logical step would be looking closer at that code to find the exact cause.
I am not a swift expert, but I would suggest disabling certain parts of the swift code until the leak disappears.
For example, you could check if this swift code alone:
guard let dataProvider = CGDataProvider(data: Data(bytes: imageData, count: length) as CFData),
let cgImage = CGImage(
width: width,
height: height,
bitsPerComponent: 8,
bitsPerPixel: 32,
bytesPerRow: width * 4,
space: CGColorSpaceCreateDeviceRGB(),
bitmapInfo: CGBitmapInfo(rawValue: CGImageAlphaInfo.premultipliedLast.rawValue),
provider: dataProvider,
decode: nil,
shouldInterpolate: false,
intent: .defaultIntent
)
else {
return strdup("Error: Failed to create CGImage")
}
// Preprocess the image
let ciImage = CIImage(cgImage: cgImage)
let context = CIContext(options: nil)
// Apply preprocessing filters (slightly reduced contrast compared to original)
let processed = ciImage
.applyingFilter("CIColorControls", parameters: [kCIInputSaturationKey: 0, kCIInputContrastKey: 1.08])
.applyingFilter("CIUnsharpMask", parameters: [kCIInputRadiusKey: 0.8, kCIInputIntensityKey: 0.4])
guard let preprocessedCGImage = context.createCGImage(processed, from: processed.extent) else {
return strdup("Error: Failed to create preprocessed image")
}
var ocrResult = ""
var textElements: [[String: Any]] = []
var totalConfidence: Float = 0.0
var observationCount: Int = 0
// disable all code after this statement by returning early.
return strdup("Is this enough to leak?")
leaks memory. If this first part leaks memory, then the issue is likely there. If calling this stub does nothing, then we know that the leak is somewhere further down the line. You can repeat this process until you find the exact cause of the leak.
Sadly, I have to go now. I will take a closer look at this tomorrow.
hey everyone, i fixed the leak, doing a few more tests and will distribute the bounty shortly
/tip $150 @FractalFir /tip $50 @exi
thanks a lot
feel free to have a look at other issues, we do a bunch of bounties; also we did not have the opportunity to test much on linux unfortunately (still trying to set up a cloud desktop with audio and vision available)
@FractalFir has been awarded $150!
@exi: You just got a $50 tip! Complete your Algora onboarding to collect your payment.
@louis030195: Navigate to your dashboard to proceed
@exi has been awarded $50!
how does screenpipe work?
previously noticed memory leaks in dependencies:
what is still to fix:
what i tried/did:
what could be helpful to try:
what i suspect is still leaking:
circular references with arc: overuse of arc without proper weak references can create reference cycles, preventing memory from being freed.
unbounded channels: using unbounded channels (e.g., mpsc::unbounded_channel()) without proper backpressure can lead to memory growth if producers outpace consumers.
long-running loops: continuous capture loops in vision and audio processing might accumulate data over time if not properly managed.
unmanaged file handles: repeatedly opening file handles for logging or data storage without proper closure could leak file descriptors.
spawned tasks not being cleaned up: tokio tasks that are spawned but not properly awaited or cancelled could lead to resource leaks.
large data structures in long-running processes: storing large amounts of data in memory for extended periods without proper cleanup.
improper error handling: failing to properly handle errors in async contexts might leave resources uncleaned.
caching without limits: implementing caches without size limits or eviction policies could lead to unbounded growth.
improper use of 'static lifetimes: overuse of 'static lifetimes might prevent data from being dropped when it's no longer needed.
resource-intensive callbacks: callbacks for audio or video processing that allocate memory without proper deallocation.
improper management of external resources: not properly releasing resources from external libraries or apis (e.g., ffmpeg, ocr engines).
accumulating historical data: storing historical data (e.g., previous images for comparison) without a retention policy.
inefficient string handling: repeated string allocations and concatenations in logging or data processing without reuse.
improper shutdown procedures: not properly shutting down all components and releasing resources when the application terminates.
memory fragmentation: frequent allocations and deallocations of varying sizes could lead to memory fragmentation, appearing as a "leak".
improper use of lazy_static or similar patterns: global state that grows over time without bounds.
inefficient use of buffers: repeatedly allocating new buffers for audio or video data instead of reusing existing ones.
improper handling of large files: loading large files entirely into memory instead of streaming or chunking.
unclosed streams: not properly closing audio or video streams, especially when dealing with multiple devices.
improper handling of device disconnections: not cleaning up resources when audio or video devices are disconnected unexpectedly.
wrong usage of ffmpeg maybe #194 would help
wrong usage of sqlite db
maybe using IPC for ffmpeg would help https://github.com/mediar-ai/screenpipe/issues/246
something else
context:
how to reproduce:
definition of done:
cc:
bounty $300
/bounty 300
happy to jump on a call if useful or for efficiency