paolobarbolini / bzip2-rs

Pure Rust bzip2 decoder
Apache License 2.0

Possible memory leak #11

Open dipstef opened 2 years ago

dipstef commented 2 years ago

Hi,

I used this library to process a bzip2-encoded Wikidata dump (70GB+) and observed that memory was not being released.

Initially I suspected the default BufReader lines iterator; however, the issue persisted even with the simpler approach of reading lines into a reused mutable buffer.
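For reference, the buffer-reusing approach looks roughly like the sketch below. It is a minimal illustration, not the exact code that was run: the function name `count_lines_reusing_buffer` is hypothetical, and the generic `reader` parameter stands in for whichever bzip2 decoder wraps the dump file. Only the standard library is used.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Reads `reader` line by line into a single reused `String` and returns
/// (number of lines, total bytes read).
///
/// Assumption: `reader` is any `Read` implementation, e.g. a bzip2 decoder
/// wrapping the compressed file. No decoder API is called here.
pub fn count_lines_reusing_buffer<R: Read>(reader: R) -> io::Result<(usize, usize)> {
    let mut buffered = BufReader::new(reader);
    let mut line = String::new(); // allocated once, reused for every line
    let mut lines = 0usize;
    let mut bytes = 0usize;

    loop {
        line.clear(); // drops the previous contents but keeps the capacity
        let read = buffered.read_line(&mut line)?;
        if read == 0 {
            break; // end of stream
        }
        lines += 1;
        bytes += read; // includes the trailing newline, if any
    }

    Ok((lines, bytes))
}
```

With this pattern the only allocation on the caller's side that can grow is `line`, which is bounded by the longest line seen, so memory that keeps growing across the whole stream would have to come from the reader being wrapped.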

Switching to this alternative, https://github.com/alexcrichton/bzip2-rs, solved the problem (memory peaks at a few MB), so I suspect the memory leak is in this library.

Cheers,

Stefano

paolobarbolini commented 2 years ago

Thanks for the report. Are you running version 0.1.2 from crates.io or the latest revision from main?

Could you try running it under valgrind and see if it gives you any clues about what's happening?

dipstef commented 2 years ago

Hey Paolo,

I was running 0.1.2; I'll give it a shot with the latest revision.

I'd love to try it with valgrind; however, I am on macOS Monterey, and my understanding is that it is not supported there yet?

Cheers,

brawer commented 1 year ago

Has this been resolved? The original bug report mentioned Wikidata dumps; see wikidata-20221128-all.json.bz2 for an example.