Open bhagemeier opened 1 year ago
Still a very much needed feature! @jessek any plans for it?
We have someone working on it now. The performance gain is not yet as much as we would have expected. Please stay tuned for updates.
> The performance gain is not yet as much as we would have expected.
That's weird... Hopefully it will be optimized! :+1:
> We have someone working on it now. The performance gain is not yet as much as we would have expected. Please stay tuned for updates.
How is the project coming along? I am CPU-bottlenecked using hashdeep, and would greatly love an "xxhashdeep" or similar. Even small improvements would be helpful.
Hi there,
At Juelich Supercomputing Centre, we've recently been researching convenient tools to generate and verify hash sums of large collections of data. The amounts we're typically talking about range from several TB to PB. We've found hashdeep convenient: it provides a good interface, including parallelisation options that can be important when checksumming and verifying many small files.
We've also come across the xxHash algorithm, a non-cryptographic hash designed for speed, which makes it well suited to creating checksums over extremely large amounts of data.
We have found that the command-line tools provided for xxHash lack some functionality offered by hashdeep. Therefore, we propose integrating xxHash into hashdeep to improve support for use cases dealing with extremely large volumes of data. We also support the idea of integrating Blake3, as mentioned in #397.
In the spirit of Open Source, we are happy to do the integration ourselves, but would first like to learn whether you would be willing to include the code in the main branch afterwards. Additionally, if there are good reasons to omit algorithms such as xxHash or Blake3, please let us know.
To back our request with numbers, here's a comparison of various algorithms supported in hashdeep and xxHash on a 155 GB data set consisting of two files.
As you can see, xxHash is at least 5 times faster than the fastest algorithm supported by hashdeep.
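For anyone who wants to run a similar comparison on their own data, here is a minimal Python sketch. It is not the setup used for the numbers above, and it does not call hashdeep itself. It times `md5`, `sha1`, and `sha256` from the standard `hashlib` module and, if the third-party `xxhash` Python package happens to be installed (an assumption), `xxh64` as well. The function names `hash_file`, `available_algorithms`, and `benchmark` are made up for this illustration.

```python
import hashlib
import time

try:
    # Third-party package, assumed to be installed (e.g. `pip install xxhash`).
    import xxhash
except ImportError:
    xxhash = None

CHUNK_SIZE = 1 << 20  # read 1 MiB at a time so large files never sit fully in memory


def hash_file(path, hasher):
    """Feed the file at `path` through `hasher` in chunks and return the hex digest."""
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(CHUNK_SIZE), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def available_algorithms():
    """Map algorithm names to zero-argument hasher constructors."""
    algos = {
        "md5": hashlib.md5,
        "sha1": hashlib.sha1,
        "sha256": hashlib.sha256,
    }
    if xxhash is not None:
        algos["xxh64"] = xxhash.xxh64
    return algos


def benchmark(paths):
    """Return {algorithm name: seconds spent hashing every file in `paths` once}."""
    timings = {}
    for name, make_hasher in available_algorithms().items():
        start = time.perf_counter()
        for path in paths:
            hash_file(path, make_hasher())
        timings[name] = time.perf_counter() - start
    return timings
```

Calling `benchmark(["file_a.bin", "file_b.bin"])` (hypothetical file names) returns a dictionary of elapsed seconds per algorithm. Treat the results with care: whichever algorithm runs first may pay the cost of reading from disk while later ones read from the page cache, and a Python loop will not match the throughput of a native tool.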