Memory leak during decryption with keyfile

hakavlad commented 1 year ago

It seems that the program leaves the read key files entirely in memory, and does not release memory when the read data is no longer needed.

At the screenshot: picocrypt is using over 20 gigabytes of memory.

pico

hakavlad commented 1 year ago

picocrypt rapidly allocated several gigabytes of memory, which led to Out Of Memory.

pico

HACKERALERT commented 1 year ago

First of all, why would anyone ever need to use a 1GB+ keyfile when a 1KB keyfile is more than enough lol. Putting that aside though, the program should clear memory eventually after a few minutes of idling. Can you give me more details? How big is your keyfile, and how many times are you encrypting in a row to generate this issue?

On Sun, Jun 11, 2023, 12:11 p.m. Alexey Avramov @.***> wrote:

It seems that the program leaves the read key files entirely in memory, and does not release memory when the read data is no longer needed.

At the screenshot: picocrypt is using over 20 gigabytes of memory.

[image: pico] https://camo.githubusercontent.com/8544af36b7589f766ef5dadf7533c82c2dfef189680908d1e47c9b8143b1441f/68747470733a2f2f692e696d6775722e636f6d2f4b6b32594265762e706e67

— Reply to this email directly, view it on GitHub https://github.com/HACKERALERT/Picocrypt/issues/165, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALUMDTCAUKB2OCTKXHMIRJLXKXU2VANCNFSM6AAAAAAZCNTXYI . You are receiving this because you are subscribed to this thread.Message ID: @.***>

hakavlad commented 1 year ago

why would anyone ever need to use a 1GB+ keyfile

I want to be able to use files of any size.

How big is your keyfile

20.2 GiB

how many times

The problem was discovered immediately as soon as I started decrypting the file encrypted using a large key file and was easily reproducible every time.

hakavlad commented 1 year ago

Putting that aside though, the program should clear memory eventually after a few minutes of idling.

the program should never allocate several GB! File hashing can be done in small chunks.

BigPanda97 commented 1 year ago

Putting that aside though, the program should clear memory eventually after a few minutes of idling.

the program should never allocate several GB! File hashing can be done in small chunks.

Just... why? Even VeraCrypt and TrueCrypt hash only the last 1 MB of a file, if the file is larger than 1 MB. So hashing such a big file is useless anyways.

HACKERALERT commented 1 year ago

Ah okay. Yes, keyfiles are read into memory and hashed directly without reading it in chunks. I suppose this is somewhat of an oversight on my end because I didn't expect anyone to be using multi-GB keyfiles (which I think is a reasonable assumption). There's no benefit to a 20 GB keyfile over a 1 KB keyfile as long as there is enough entropy (which the built-in keyfile generator provides). There's not much I can do about this because any changes would likely break compatibility with existing volumes. I think I'll pass on this one.

hakavlad commented 1 year ago

any changes would likely break compatibility with existing volumes

Why would hashing a file in chunks instead of hashing the whole thing be incompatible?

hakavlad commented 1 year ago

Encryption does not cause such a memory leak. Are key files hashed differently throughout encryption and decryption?

UPD: encryption can also cause the problem.

HACKERALERT commented 1 year ago

Encryption does not cause such a memory leak. Are key files hashed differently throughout encryption and decryption?

Not quite sure what you mean, but encryption is done in 1 MB chunks whereas the keyfiles are read entirely into memory before being hashed. So for encryption, there isn't any "unnecessary" usage of memory but for keyfiles, there is. As long as you don't use large keyfiles, memory usage will be reasonable and eventually free itself.

Why would hashing a file in chunks instead of hashing the whole thing be incompatible?

Sorry, I was thinking about BigPanda's comment when writing this. Yeah, using only the first or last 1 MB will break compatibility for keyfiles larger than 1 MB. Hashing the keyfiles in chunks should be fine though. However, I still do not see much benefit and doubt that many people would be using such large keyfiles.

hakavlad commented 1 year ago

If you don't consider memory leaks a problem, users should at least be warned about the risks of memory leaks in the documentation.

HACKERALERT commented 1 year ago

A memory leak is when a program unintentionally uses memory without releasing it. In this case, nothing about the behaviour is unexpected per se, since the code dictates to read the entire keyfile into memory which is exactly what the program does. Perhaps it makes more sense to call it a poor design choice, though I would say it is a poor idea to be using such large keyfiles in the first place.

hakavlad commented 1 year ago

since the code dictates to read the entire keyfile into memory

What about #168? I saw picocrypt only read the first GiB, not the entire file.

In any case, the hashing of the files occurs in a sub-optimal way, and this can be corrected without breaking compatibility.

HACKERALERT commented 1 year ago

Let's move this over to #168 to keep things organized.

HACKERALERT / Picocrypt

Memory leak during decryption with keyfile #165