rhash / RHash

Great utility for computing hash sums
http://rhash.sf.net
BSD Zero Clause License

Resumable hashing #94

Open DannyZB opened 5 years ago

DannyZB commented 5 years ago

Have any of you considered resumable hashing for rhash?

When hashing extremely large files, 20GB and up, being able to resume hashing from a previous position would help a ton.

Is this something you've considered?

rhash commented 5 years ago

It's an interesting feature request. It can be implemented by serializing internal librhash state into a "partly hashed" file.

But for now it's a low-priority feature request, so I'm not sure when I'll get to it.

DannyZB commented 5 years ago

You have knowledge of the library.

Could you spend 15 minutes and give a rundown of where that code is and what essentially needs to happen?

I might look into implementing it, but I would rather know where to look without learning the entire code base.

It's very useful for download automation where you need hashing: the download can be split off into a stream piped into rhash. Partial hashing is necessary to survive crashes (long downloads tend to have issues).

I.e. a way to send in the "partially hashed" file or load it after a crash. The same code can be reused to increase stability during crashes.

When you hash a 50 GB file and it breaks in the middle, that's a small nightmare scenario.

milahu commented 1 year ago

see also https://stackoverflow.com/questions/2130892/persisting-hashlib-state

you really just have to save and load
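
A minimal sketch of that save-and-load idea, assuming a checksum whose whole state is small enough to write to disk. It uses Python's zlib.crc32 rather than librhash, because the running CRC value is the complete state there; the function names, the JSON checkpoint format, and the max_chunks parameter (used only to simulate an interruption) are hypothetical and not part of rhash.

```python
import json
import os
import zlib

CHUNK_SIZE = 1 << 20  # read 1 MiB at a time


def save_state(state_path, offset, crc):
    """Write the checkpoint to a temp file, then rename it over the old one."""
    tmp_path = state_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"offset": offset, "crc": crc}, f)
    os.replace(tmp_path, state_path)  # avoids a half-written checkpoint


def resumable_crc32(path, state_path, chunk_size=CHUNK_SIZE, max_chunks=None):
    """Return the CRC32 of `path`, resuming from `state_path` if it exists.

    Returns None if `max_chunks` chunks were processed before the end of
    the file (a simulated interruption); the checkpoint stays on disk.
    Assumes the already-hashed part of the file has not changed.
    """
    offset, crc = 0, 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)
        offset, crc = state["offset"], state["crc"]

    chunks_done = 0
    with open(path, "rb") as f:
        f.seek(offset)  # skip the part that was hashed in an earlier run
        while True:
            if max_chunks is not None and chunks_done >= max_chunks:
                return None
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)  # continue from the running value
            offset += len(chunk)
            chunks_done += 1
            # Checkpointing after every chunk is simple but costly;
            # a real tool would likely do it far less often.
            save_state(state_path, offset, crc)

    if os.path.exists(state_path):
        os.remove(state_path)  # finished, the checkpoint is no longer needed
    return crc
```

Calling resumable_crc32("big.iso", "big.iso.state") again after a crash would continue from the last saved offset instead of starting over. Cryptographic hashes carry more internal state (buffered block, message length, chaining values), so the same pattern needs library support for exporting that state.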

rhash commented 1 year ago

Since commit bbbe1beae95217b458ba43d4a90b7858325cca45, librhash provides the rhash_export() and rhash_import() functions to save and load its internal state. Now it's not hard to support resumable hashing of a single file.

Some things are not clear: