jessek / hashdeep


Include option to be used with -f <file> to accept null-character (zero byte) delimiting for file names. #416

Open aghsmith opened 1 month ago

aghsmith commented 1 month ago

hashdeep can accept a list of files to be hashed, which can be very useful given its otherwise limited file selection ability.

I have used it with an invocation like: hashdeep -c md5 -f <(find . -type f ) together with some other filters for find, which for the most part works. However, files on Unix-like file systems can have \r and \n (carriage-return and line-feed) characters in their names.

With the -f option, hashdeep accepts only line breaks as file name delimiters, which is too simple to cope with such names.

A file created with touch filename$'\r', for example, is not accepted by hashdeep when its name arrives through find like this, although hashdeep can hash the same file when it does not receive the name through an external file list.

find does have an option to separate file names with zero bytes: find . -type f -print0

Currently hashdeep is unable to process this output (unless there is only one file in the list).
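To make the ambiguity concrete, here is a minimal sketch in POSIX sh that does not call hashdeep at all. It only counts how many records a consumer would see in each kind of listing. The scratch directory and the three file names are made up for illustration.

```shell
# Build a scratch directory with three files, two of which have awkward names.
demo_dir=$(mktemp -d)
nl='
'                      # a literal line feed
cr=$(printf '\r')      # a literal carriage return
touch "$demo_dir/plain.txt" "$demo_dir/with${nl}newline" "$demo_dir/filename${cr}"

# Newline-delimited listing: the embedded line feed splits one name into two records.
newline_records=$(find "$demo_dir" -type f | wc -l | tr -d ' ')

# NUL-delimited listing: exactly one terminator per file, whatever the name contains.
nul_records=$(find "$demo_dir" -type f -print0 | tr -cd '\000' | wc -c | tr -d ' ')

echo "files created:   3"
echo "newline records: $newline_records"
echo "NUL records:     $nul_records"

rm -rf "$demo_dir"
```

With three files on disk, the newline-delimited listing yields four records and the NUL-delimited listing yields three. A reader that splits on line breaks therefore cannot recover the original names.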

It seems like this would be a fairly easy feature to develop, and it would be very helpful for handling the edge cases with -f <file>.
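Until such an option exists, one possible interim approach is to bypass -f and pass the names as command-line arguments through xargs -0. This is only a sketch: hash_nul_list is a hypothetical helper name, and it assumes that receiving the files as arguments is acceptable for the job at hand.

```shell
# hash_nul_list DIR CMD [ARGS...]
# Run CMD on every regular file under DIR. Names travel NUL-delimited from
# find to xargs, so embedded \r or \n characters stay part of the name.
hash_nul_list() {
    dir=$1
    shift
    find "$dir" -type f -print0 | xargs -0 "$@"
}

# Intended use (assumes hashdeep is installed and on PATH):
#   hash_nul_list . hashdeep -c md5
```

One caveat: for a very long list, xargs may split the names across several invocations of the command, so the combined output may contain more than one header block. That is part of why a native option for -f would still be preferable.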

aghsmith commented 1 month ago

I asked a question on Stack Overflow about how to solve my problem before realising that there probably isn't a good solution: https://stackoverflow.com/questions/78530775/trouble-with-hashdeep-fed-by-find-and-unusual-characters

BTW, the reason I'm not just using hashdeep's own file selection is that we have a directory structure that puts snapshot directories inside some of the directories we need to hash. We would wind up having hashdeep compute the hashes of essentially the same files more than once, which is time-consuming, even if the snapshot-related lines of output are removed later.
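For illustration, the kind of find filter involved could look like the sketch below. The directory name .snapshot and the helper name list_without_snapshots are assumptions. The real snapshot directories may be named differently, in which case the second argument can be changed.

```shell
# list_without_snapshots DIR [SNAPDIR_NAME]
# Print the regular files under DIR as a NUL-delimited list, without descending
# into any directory called SNAPDIR_NAME (default ".snapshot", a hypothetical name).
list_without_snapshots() {
    find "$1" -type d -name "${2:-.snapshot}" -prune -o -type f -print0
}

# Possible use (assumes hashdeep is installed and on PATH):
#   list_without_snapshots . | xargs -0 hashdeep -c md5
```

The -prune action stops find from walking into the snapshot directories at all, so their contents are never listed and never hashed. Nothing has to be stripped from the output afterwards.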