ColumPaget / Hashrat

Hashing tool supporting md5,sha1,sha256,sha512,whirlpool,jh and hmac versions of these. Includes recursive file hashing and other features.
GNU General Public License v3.0

Unexpected result while hashing multiple times #8

Closed. Mi-Al closed this issue 6 years ago.

Mi-Al commented 6 years ago

Hello! When I run

echo -n 'test' | hashrat -type sha256,md5

I get: 9962791b4b77f382035b7869e0a8eaf8

But if I do it in a different way:

echo -n 'test' | sha256sum 
9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08  -
echo -n '9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08' | md5sum
3de1d73a19c7590ee66398a17e93bd67  -

I get 3de1d73a19c7590ee66398a17e93bd67. So the final result is different, and it looks like a bug in the program.

ColumPaget commented 6 years ago

No, it's not a bug, but it is an interesting issue that had me scratching my head for a bit. Utilities like sha256sum and md5sum print the hash value as a hex-encoded string, but the hash value is really just a large number. Hashrat can print this number in octal, hex, base64, or a number of other output formats. Internally, though, when it's doing repeated hashing, hashrat doesn't convert the intermediate hash values to hex, base64, or anything else.
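To make the "a hash is just a number" point concrete, here is a minimal Python sketch (an illustration using the standard hashlib and base64 modules, not hashrat's own code) that computes one SHA-256 digest and prints it in several encodings. The variable names are made up for this example.

```python
import base64
import hashlib

# The digest itself is 32 raw bytes; everything below is only presentation.
digest = hashlib.sha256(b"test").digest()

as_hex = digest.hex()                               # what sha256sum prints
as_base64 = base64.b64encode(digest).decode("ascii")
as_integer = int.from_bytes(digest, "big")          # the "large number" view

print(len(digest), "raw bytes")
print("hex:   ", as_hex)
print("base64:", as_base64)
print("int:   ", as_integer)
```

All three printed forms decode back to the same 32 bytes, so which one is shown to the user has no bearing on what the value is.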

So, inside hashrat the sha256 hash is computed, and this produces a binary value, not a hex-encoded string. That value is then fed into the md5 hash, which produces another binary value, which is then encoded as hex, base64 or whatever for output (unless we want to do more hashes, in which case it is passed along the pipeline unencoded).
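The pipeline described above can be sketched in a few lines of Python. This is only an approximation of the behaviour described in this comment, not hashrat's actual implementation; `chained_hash` is a hypothetical helper name, and the algorithm names are the ones Python's hashlib accepts.

```python
import hashlib
from typing import Iterable


def chained_hash(data: bytes, algorithms: Iterable[str]) -> bytes:
    """Hash data with each algorithm in turn, passing raw digest bytes along.

    Nothing is hex- or base64-encoded between stages; encoding is left to
    the caller, as a final output step.
    """
    for name in algorithms:
        data = hashlib.new(name, data).digest()
    return data


# Equivalent in spirit to: echo -n 'test' | hashrat -type sha256,md5
final = chained_hash(b"test", ["sha256", "md5"])
print(final.hex())
```

Under these assumptions the printed value should match the 9962791b4b77f382035b7869e0a8eaf8 reported for hashrat in this thread.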

In order to simulate this with sha256sum and md5sum, we'd need some way of decoding the hex output into raw bytes, so that md5sum hashes the 32-byte digest rather than its 64-character hex representation. I found the xxd utility can do this:

echo -n 'test' | sha256sum
9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08  -
echo -n '9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08' | xxd -r -p - | md5sum
9962791b4b77f382035b7869e0a8eaf8  -

This now produces the same output as hashrat.
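For anyone without xxd to hand, the same check can be done in Python, where `bytes.fromhex` plays the role of `xxd -r -p`. This is a rough sketch for comparison purposes only; the variable names are illustrative.

```python
import hashlib

# What sha256sum prints: 64 ASCII hex characters.
hex_text = hashlib.sha256(b"test").hexdigest()

# Pipeline from the original report: md5 over the hex *text*.
md5_of_text = hashlib.md5(hex_text.encode("ascii")).hexdigest()

# Pipeline with the decode step: md5 over the 32 raw bytes.
raw = bytes.fromhex(hex_text)
md5_of_raw = hashlib.md5(raw).hexdigest()

print(len(hex_text), "chars ->", md5_of_text)
print(len(raw), "bytes ->", md5_of_raw)
```

The two md5 inputs differ in both length and content, which is why the two command-line pipelines in this thread disagree.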

But this method of feeding the output of one hashing program into another hadn't occurred to me before, and it's handy for checking that hashrat is working as expected, so I'll be checking for this kind of thing going forwards.