The threading option is very efficient on filesystems backed by drives with negligible random-access seek times (such as SSDs). On hard drives, however, the seeks caused by two threads reading files in parallel can cancel out the performance gain and may even be slower than single-threaded operation. I know that -j0 cures the thrashing problem, but then hashdeep only does read-compute-read-compute sequentially.
Suggestions:
Add an option to have a single thread do all the I/O (or use a semaphore/lock that prevents the threads from doing I/O operations at the same time). This way, reading runs concurrently with hashing, and throughput is bound only by /either/ CPU or I/O, whichever is slower.
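To illustrate the single-reader idea, here is a rough sketch in Python (not hashdeep's actual C++ code). The function name `hash_files` and the parameters `workers` and `max_buffered` are made up for this example. It reads each file whole into memory, which a real implementation would probably replace with chunked reads, and it uses SHA-256 only as a stand-in for whatever hash is selected.

```python
import hashlib
import queue
import threading


def hash_files(paths, workers=2, max_buffered=4):
    """Hash files using one reader thread and several hashing threads.

    Hypothetical helper for illustration only; returns {path: hex digest}.
    """
    # Bounded queue: the reader blocks when the hashers fall behind,
    # so at most max_buffered files are held in memory at once.
    buffered = queue.Queue(maxsize=max_buffered)
    digests = {}
    digests_lock = threading.Lock()

    def reader():
        # The only thread that touches the disk, so files are read
        # strictly one after another and never compete for the drive head.
        try:
            for path in paths:
                with open(path, "rb") as f:
                    data = f.read()  # simplification: whole file in memory
                buffered.put((path, data))
        finally:
            # One sentinel per hasher so every worker terminates,
            # even if a read failed part-way through.
            for _ in range(workers):
                buffered.put(None)

    def hasher():
        while True:
            item = buffered.get()
            if item is None:
                break
            path, data = item
            digest = hashlib.sha256(data).hexdigest()
            with digests_lock:
                digests[path] = digest

    threads = [threading.Thread(target=reader)]
    threads += [threading.Thread(target=hasher) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return digests
```

With this layout the disk only ever sees sequential reads of one file at a time, while the hashing threads work on files that are already buffered. The bounded queue is what keeps the reader from running arbitrarily far ahead of the CPU.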
To improve on this approach (with more than one computing thread), one could pick the next file to hash by alternating between the biggest and the smallest remaining file. This would avoid starvation on either the I/O or the CPU side.
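The alternating order could be produced along these lines. Again this is only a Python sketch with a made-up function name, `alternate_by_size`, and it assumes the file sizes are already known as (path, size) pairs before hashing starts.

```python
def alternate_by_size(sized_paths):
    """Order paths as biggest, smallest, second biggest, second smallest, ...

    sized_paths is an iterable of (path, size_in_bytes) pairs.
    Hypothetical helper for illustration only.
    """
    ordered = sorted(sized_paths, key=lambda item: item[1], reverse=True)
    result = []
    big, small = 0, len(ordered) - 1
    take_big = True
    while big <= small:
        if take_big:
            result.append(ordered[big][0])
            big += 1
        else:
            result.append(ordered[small][0])
            small -= 1
        take_big = not take_big
    return result
```

The sizes could be collected with something like `os.path.getsize` while walking the directory tree, and the resulting list handed to the reader thread as its work order.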