OV2 / RapidCRC-Unicode

Windows tool to quickly create and verify hash checksums
https://www.ov2.eu/programs/rapidcrc-unicode
GNU General Public License v2.0

[Feature Request] Make Blake3 available in NTFS Streams #177

Closed Thunderbolt32 closed 2 years ago

Thunderbolt32 commented 2 years ago

I would be happy if a more robust hash than CRC32, such as the fast Blake3, could be stored in NTFS streams. Would it be possible to make storing and reading in NTFS streams available in RapidCRC Unicode for this hash algorithm, too?

The advantage of NTFS streams is that the hash value is sticky and robust against e.g. move and rename operations, especially if these are then also carried out by other tools (e.g. TinyMediaManager). With hash files, the file location references would become inconsistent, whereas the NTFS streams remain linked.

Thank you very much.

OV2 commented 2 years ago

I've started to add this feature; it will hopefully be in the next version.

Thunderbolt32 commented 2 years ago

Many many thanks in advance!

OV2 commented 2 years ago

I've implemented this, similar to how the CRC value was saved. While looking through other issues I noticed that TeraCopy also has the ability to save hash values to a stream, but uses a different format (which looks like it is taken from TalAloni/MD5Stream). So I'm currently considering if it is worth it to also switch to this format.

Thunderbolt32 commented 2 years ago

MD5Stream

Let's look at an example of a TalAloni/MD5Stream stream (cat :MD5:$DATA):

 MD5=EBC4ED3AC805B99666C3D8DBFF3FA7DB
 FileWriteTimeUtc=2022-09-16T16:43:32.2740000Z

MD5Stream does have an additional feature: when a checksum does not match, it reports the mismatch differently depending on whether the file modification date is more recent than the checksum generation date. Whether this is a real use case (KISS principle) remains to be seen. It might also help RapidCRC users to see a different error message. However, MD5Stream ignores the command/query separation principle and auto-updates the checksum immediately.

But introducing DSL file formats, as MD5Stream does here and as TeraCopy does with its custom MD5 files (see issue https://github.com/OV2/RapidCRC-Unicode/issues/95), is bullshit to me, especially if the format is badly designed and application-specific.
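To make the distinction above concrete, here is a minimal Python sketch, not MD5Stream's actual code. It assumes the stream holds plain `KEY=VALUE` lines as in the sample, and that `FileWriteTimeUtc` records the file's last-write time at hashing time. The function names and the labels `"ok"`, `"modified"` and `"corrupt"` are hypothetical.

```python
from datetime import datetime, timezone


def parse_md5stream(text):
    """Parse KEY=VALUE lines into a dict (layout inferred from the sample)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        fields[key] = value
    return fields


def parse_utc(stamp):
    """Parse e.g. '2022-09-16T16:43:32.2740000Z'.

    The sample has 7 fractional digits; strptime's %f takes at most 6,
    so the fraction is trimmed (or padded) to exactly 6 digits.
    """
    body = stamp.rstrip("Z")
    if "." in body:
        head, frac = body.split(".", 1)
        body = f"{head}.{frac[:6].ljust(6, '0')}"
    else:
        body += ".000000"
    parsed = datetime.strptime(body, "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def classify(stored_md5, actual_md5, recorded_write_time, current_write_time):
    """Return a hypothetical status label for a stored vs. actual hash."""
    if stored_md5.lower() == actual_md5.lower():
        return "ok"
    if current_write_time > recorded_write_time:
        # File was written after the hash was recorded: probably an edit.
        return "modified"
    # Same write time but different content: likely silent corruption.
    return "corrupt"
```

A raw-checksum stream cannot make this distinction, which is the trade-off being weighed here.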

One Suggestion

If I have to, my mantra would be: Don't treat NTFS streams like something special, but only as a second storage method. So I would simply store an ordinary checksum file in the NTFS Stream. There are three aspects of this:

  1. The current file name is also saved. It would be ignored when reading a single-file sticky NTFS stream.
  2. If I had to, the timestamp would be stored in a comment to keep the file format compatible with everything else, so other checksum-checking tools can simply ignore it and we don't have to introduce a new Domain-Specific Language (DSL).
  3. If we don't use a shorthand (like a single asterisk *) for the filename and store the full filename instead, we could (if we want) move the checksum file out of the NTFS stream into a normal file system file and everything would be fine. Magic! 🌟
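The three points above can be sketched roughly as follows. This is only an illustration under assumptions: it uses an SFV-style layout (`filename CRC32` per line, `;` starting a comment line) and CRC32 from the standard library as a stand-in hash. The function names are hypothetical and do not reflect how RapidCRC Unicode reads or writes files.

```python
import zlib


def make_checksum_text(filename, data, timestamp):
    """Build an ordinary checksum file; the timestamp lives in a comment."""
    crc = zlib.crc32(data) & 0xFFFFFFFF
    return f"; generated {timestamp}\n{filename} {crc:08X}\n"


def parse_checksum_text(text):
    """Return {filename: checksum}, skipping blank and comment lines."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # tools that do not know the timestamp just skip it
        name, checksum = line.rsplit(" ", 1)
        entries[name] = checksum.upper()
    return entries


def single_stream_checksum(text):
    """Point 1: for a per-file stream, ignore the stored name entirely."""
    entries = parse_checksum_text(text)
    if len(entries) != 1:
        raise ValueError("expected exactly one entry in a per-file stream")
    return next(iter(entries.values()))
```

Because the text is a complete checksum file with the full filename, the same bytes could be written to a regular file next to the data and parsed by the same reader.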

Additionally: Peace with NTFS Streams 🌈: If we already store full checksum files in the NTFS stream, checksum files attached to NTFS streams of folders would also be trivial to implement, as the behaviour would then be the same as if the file was simply stored inside the folder.

Edit: Sorry, my suggestion was a bad idea, too: https://xkcd.com/927/

Finally

But I don't care. The old behavior of RapidCRC Unicode for CRC32 in NTFS streams (store the raw checksum value) is easy to understand, and it is also easy to motivate other checksum tools to implement it - and it is space-efficient! So, KISS principle: working out how and why a checksum is no longer correct is not a usual task of a checksum program. The ability to read the other formats would be fine, but I would not adopt them. At least, that's my opinion.
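As a rough illustration of how little a raw-value stream demands from another tool, here is a Python sketch. The stream name `CRC32` and the choice of uppercase hex text are assumptions made for the example, not confirmed details of RapidCRC Unicode. On Windows with NTFS, opening `path:name` addresses an alternate data stream; on other systems the same string is just a separate file whose name contains a colon.

```python
import os
import zlib

STREAM_NAME = "CRC32"  # hypothetical stream name, chosen for this sketch


def stream_path(path, stream=STREAM_NAME):
    """Build the 'file:stream' path used to address an NTFS stream."""
    return f"{os.fspath(path)}:{stream}"


def file_crc32(path):
    """Compute the CRC32 of a file in chunks, as 8 uppercase hex digits."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            crc = zlib.crc32(chunk, crc)
    return f"{crc & 0xFFFFFFFF:08X}"


def store_raw(path):
    """Write only the checksum value into the stream, nothing else."""
    value = file_crc32(path)
    with open(stream_path(path), "w", encoding="ascii") as s:
        s.write(value)
    return value


def verify_raw(path):
    """Read the raw value back and compare it with a fresh checksum."""
    with open(stream_path(path), "r", encoding="ascii") as s:
        stored = s.read().strip()
    return stored.upper() == file_crc32(path)
```

There is no parsing step at all: the stream content is the checksum.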

Thunderbolt32 commented 2 years ago

Instead of using TeraCopy or something similar, my thought would be to parse the checksums out of the NTFS streams with jonelo/jacksum, for example. The raw checksum makes this a little easier, which is why, in my view, RapidCRC Unicode's behaviour should stay as it is.

Thunderbolt32 commented 2 years ago

@OV2 Can you provide a new prebuilt binary for me? I want to see how your changes behave and can't get into the C/C++/C#-Dev-Environment-Setup. ^^

OV2 commented 2 years ago

Sorry about the delay, I got held up with some covid issues... I'll try to do a new release when I get home today.

Thunderbolt32 commented 2 years ago

I'm sorry for pressuring you. I was just a bit unsure, as there hasn't been a release in over a year. Take the time you need.

OV2 commented 2 years ago

No worries, asking doesn't hurt. I've uploaded a new release you can try.

Thunderbolt32 commented 2 years ago

I've added the BLAKE3 checksum to the NTFS stream of a file, and RapidCRC Unicode recognises and checks it. But I don't quite understand how to make RapidCRC Unicode store a BLAKE3 checksum in the NTFS stream. The "Shell Extension" configuration seems to be unchanged, and I also can't find an option to display a button like "BLAKE3 into NTFS Stream" or something.

[Screenshot: 2022-10-12_14h43_34]

OV2 commented 2 years ago

It works similar to the "into filename" button - if you right click on the "into ntfs stream" button you get a popup menu with all the hashes.

Thunderbolt32 commented 2 years ago

Hidden like an Easter egg. ^^ Thank you. It works!

I believe RapidCRC Unicode is now a power-user tool, as problems now arise from the new possibilities. It is entirely up to you whether you want to address them. For my part, I can handle it without any problems and just document them. ;)

I think I will close this issue then, since this feature request is fully addressed. So thank you very much. ;)

OV2 commented 2 years ago

Maybe I can find some way to better signal this to users. I did not want to turn the whole button into a dropdown (like the sha and blake buttons), since it would introduce an extra click for "normal usage". But maybe that wouldn't really matter - I'll think about it.