Closed · hakavlad closed this issue 1 year ago
Picocrypt rapidly allocated several gigabytes of memory, which led to an out-of-memory (OOM) condition.
First of all, why would anyone ever need to use a 1GB+ keyfile when a 1KB keyfile is more than enough lol. Putting that aside though, the program should clear memory eventually after a few minutes of idling. Can you give me more details? How big is your keyfile, and how many times are you encrypting in a row to trigger this issue?
On Sun, Jun 11, 2023, 12:11 p.m., Alexey Avramov wrote:
It seems that the program keeps the keyfiles it has read entirely in memory, and does not release the memory when the data is no longer needed.
In the screenshot: Picocrypt is using over 20 gigabytes of memory.
why would anyone ever need to use a 1GB+ keyfile
I want to be able to use files of any size.
How big is your keyfile
20.2 GiB
how many times
The problem appeared immediately, as soon as I started decrypting a file that had been encrypted with a large keyfile, and it was easily reproducible every time.
Putting that aside though, the program should clear memory eventually after a few minutes of idling.
The program should never allocate several GB! File hashing can be done in small chunks.
Just... why? Even VeraCrypt and TrueCrypt hash only the last 1 MB of a file, if the file is larger than 1 MB. So hashing such a big file is useless anyway.
Ah okay. Yes, keyfiles are read into memory and hashed directly, without being read in chunks. I suppose this is somewhat of an oversight on my end because I didn't expect anyone to be using multi-GB keyfiles (which I think is a reasonable assumption). There's no benefit to a 20 GB keyfile over a 1 KB keyfile as long as there is enough entropy (which the built-in keyfile generator provides). There's not much I can do about this because any changes would likely break compatibility with existing volumes, so I think I'll pass on this one.
any changes would likely break compatibility with existing volumes
Why would hashing a file in chunks instead of hashing the whole thing be incompatible?
Encryption does not cause such a memory leak. Are keyfiles hashed differently during encryption and decryption?
Update: encryption can also cause the problem.
Encryption does not cause such a memory leak. Are keyfiles hashed differently during encryption and decryption?
Not quite sure what you mean, but encryption is done in 1 MB chunks whereas the keyfiles are read entirely into memory before being hashed. So for encryption, there isn't any "unnecessary" usage of memory but for keyfiles, there is. As long as you don't use large keyfiles, memory usage will be reasonable and eventually free itself.
Why would hashing a file in chunks instead of hashing the whole thing be incompatible?
Sorry, I was thinking about BigPanda's comment when writing this. Yeah, using only the first or last 1 MB will break compatibility for keyfiles larger than 1 MB. Hashing the keyfiles in chunks should be fine though. However, I still do not see much benefit and doubt that many people would be using such large keyfiles.
If you don't consider memory leaks a problem, users should at least be warned about the risk in the documentation.
A memory leak is when a program unintentionally uses memory without releasing it. In this case, nothing about the behaviour is unexpected per se, since the code dictates to read the entire keyfile into memory which is exactly what the program does. Perhaps it makes more sense to call it a poor design choice, though I would say it is a poor idea to be using such large keyfiles in the first place.
since the code dictates to read the entire keyfile into memory
What about #168? I saw that Picocrypt only reads the first GiB, not the entire file.
In any case, the keyfiles are hashed in a sub-optimal way, and this can be fixed without breaking compatibility.
Let's move this over to #168 to keep things organized.