OV2 / RapidCRC-Unicode

Windows tool to quickly create and verify hash checksums
https://www.ov2.eu/programs/rapidcrc-unicode
GNU General Public License v2.0

Filename with symbols not being found during verification #87

Closed: tERyceNzAchE closed this issue 3 years ago

tERyceNzAchE commented 3 years ago

Filename: 11. ✓ CHALLENGE 1-2 - let, var and closures - SOLUTION.mp4

RapidCRC correctly hashes the file and stores the correct filename. On verify, however, it reads the filename as: 11. ✓ CHALLENGE 1-2 - let, var and closures - SOLUTION.mp4 and reports the file as not found.

This is on Windows 10 Pro (x64) with RapidCRC-Unicode 0.3.36.

OV2 commented 3 years ago

This is a problem with the codepage autodetection. It cannot correctly determine that this is utf8 from the text the hash file contains. You can fix it by configuring rcrc to read utf8 by default: [screenshot of the setting]

The other option is to create hash files as UTF16, which contain a header that does not require auto detection.
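
To illustrate the two points above, here is a minimal Python sketch (not RapidCRC's actual code). It assumes the autodetection falls back to a single-byte codepage such as windows-1252; the thread does not say which codepage was actually chosen. `has_utf16_bom` is a hypothetical helper name.

```python
import codecs

# Filename from the report above.
name = "11. ✓ CHALLENGE 1-2 - let, var and closures - SOLUTION.mp4"

# A UTF-8 hash file without a BOM stores the check mark as bytes E2 9C 93.
utf8_bytes = name.encode("utf-8")

# Assumption: the reader guesses windows-1252. Each of the three bytes is
# then decoded as its own character, so the name no longer matches the file.
misread = utf8_bytes.decode("cp1252")


def has_utf16_bom(data: bytes) -> bool:
    """Return True if the data starts with a UTF-16 byte order mark."""
    return data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))


# Python's "utf-16" codec prepends a BOM, so no guessing is needed on read.
utf16_bytes = name.encode("utf-16")

print(misread == name)             # False
print(has_utf16_bom(utf16_bytes))  # True
print(has_utf16_bom(utf8_bytes))   # False
```

The BOM is why the UTF16 option sidesteps the problem: the encoding is announced by the first two bytes instead of being inferred from the content.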

tERyceNzAchE commented 3 years ago

On a Windows 10 system, why not assume UTF-8 as default? Even Notepad uses UTF-8 without BOM as the default.

OV2 commented 3 years ago

Notepad uses a better auto-detect algorithm that correctly determines UTF8. I might look into using a different algorithm if I find the time.

The setting does not default to utf8 to allow old hash files to work correctly, but it has been so long that I might simply change this for the next release.
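
One common detection heuristic, sketched here in Python, is to check for a BOM first and then test whether the bytes are valid UTF-8, falling back to a legacy codepage only if they are not. This is an assumption about what a "better" algorithm could look like; it is not claimed to be what Notepad or RapidCRC does, and `guess_encoding` and its `fallback` parameter are hypothetical names.

```python
import codecs


def guess_encoding(data: bytes, fallback: str = "cp1252") -> str:
    """Guess a text encoding: BOM first, then strict UTF-8, then fallback."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    try:
        # Non-ASCII text in a legacy codepage is rarely valid UTF-8 by
        # accident, so a clean strict decode is strong evidence for UTF-8.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return fallback
    return "utf-8"


print(guess_encoding("✓ SOLUTION.mp4".encode("utf-8")))
print(guess_encoding("café.mp4".encode("cp1252")))
```

Pure ASCII content is reported as UTF-8 here, which is harmless because ASCII decodes identically in both encodings.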

tERyceNzAchE commented 3 years ago

Are old hash files ASCII?

OV2 commented 3 years ago

Old hash files were usually created in the codepage of the system they were created on, for example windows-1252 for Western Europe. Without the auto-detect, these will be opened as utf8 and produce similar problems if they contain non-ASCII characters.

But as I said, I'll probably just make utf8 the default and let those that encounter issues turn the auto detect back on.
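
The legacy-codepage problem described above can be reproduced with a short Python sketch. The filename `Übersicht.txt` is a made-up example, and the code only illustrates the decoding behaviour, not how RapidCRC reads hash files.

```python
# Hypothetical filename as an old tool on a Western European system
# would have written it: one byte (0xDC) for the umlaut.
legacy_name = "Übersicht.txt"
legacy_bytes = legacy_name.encode("cp1252")

# Read with the codepage it was written in: the name round-trips.
as_legacy = legacy_bytes.decode("cp1252")

# Read as UTF-8 with lenient error handling: the umlaut byte is not a
# valid UTF-8 sequence and is replaced with U+FFFD.
as_utf8_lenient = legacy_bytes.decode("utf-8", errors="replace")


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes are valid UTF-8 under strict decoding."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(as_legacy == legacy_name)       # True
print(as_utf8_lenient == legacy_name) # False
print(decodes_as_utf8(legacy_bytes))  # False
```

Either way the decoded name differs from the file on disk, which is why such files would need auto-detect turned back on.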