nikola / gruft

A fast implementation of cryptographic hash functions in pure JavaScript: MD5, SHA-1, SHA-256, and TIGER/192.
http://nikola.github.io/gruft/
Other
13 stars 4 forks source link

Tiger192 hashing problem #1

Open heliofh opened 9 years ago

heliofh commented 9 years ago

I am performing several tests with the Tiger192 library against the NppCalc implementation.

The library works really well, and faster than I expected. After performing some thousands of tests, I might have found a bug:

Input:
kå-Ôµ¦¨·G:ípЋNTHãŒÒ
It contains non-printable chars, so it is better to use its hex value:
6BE5192DD4B5A6A8B7473AED70D08B1F4E541248E38C1AD2

Output (gruft 0.2.2):
70d21ab3d715412d582876ce7b373212cd729c85d936a1c4 (BE*)
2d4115d7b31ad2701232377bce762858c4a136d9859c72cd (LE*)

Output (Pascal - NppCalc 1.3 in Notepad++):
434420381EDE8D3815ACEC394A59C99A9B2F56EC09F7CECA (BE)
388DDE1E382044439AC9594A39ECAC15CACEF709EC562F9B (LE converted from BE)

Output (Java - Jacksum 1.7.0):
388dde1e382044439ac9594a39ecac15cacef709ec562f9b (LE) << confirms NppCalc

* BE for Big Endian (MSB-first) and LE for Little Endian.

If I take out the last byte (D2) of the input, the outputs match in these three implementations.

I'm going to run some more tests. If it is really a bug, I don't know if the culprit is gruft-common.js (less likely) or gruft-tiger192.js (those s-boxes in base91 can be tricky). As far as I can tell for now, this bug is really rare to happen and probably has nothing to do with padding. I've tested gruft 0.2.2 with the same results in Firefox and Chrome.

Best regards, Hélio

nikola commented 9 years ago

@heliofh thank you for bringing this to my attention. I can confirm that for the input that you supplied the TIGER/192 hash does not match what other tools compute.

The issue might be in initial MD4 padding of the input. Right now I'm using simple string concatenation to append the padding. In your example input, the string terminates with an incomplete unicode codepoint. Appending the MD4 padding with the + operator might be tripping up the JS interpreters' view of the string, as JS does not actually have a true byte-sized sequence type.

Changing the input to this (second-last char from 0x1A to 0x19) does compute the correct digest:

6BE5192DD4B5A6A8B7473AED70D08B1F4E541248E38C19D2

-> 6a4bb51ec4eee9881e5d25b5b6d34512bed72ff27f9897ff

I'll be investigating this over the next few days.

heliofh commented 9 years ago

Hi. Thanks for the quick response.

I've done about 300.000 tests with SHA-256 and MD5 using inputs with the same length of the output. Not even a single error. With tiger, there's at least one error for about 4.000 semi-random inputs.

So maybe only the poor Tiger is sick :(

Two more inputs that seems to be generating wrong outputs. That might help with debugging:

**Input**:
=uôâ³h³¿¼'oQGرۡ·P¤ñ¹
3D75F4E2B368B3BFBC276F5147D80AB1DBA1B750A4F1B912
**Output gruft 0.2.0**:
9947e65f33480de4a4cf08bb90dc51f23079a249da012cec (BE)
**Output NppCalc**:
BE8649269189068F83D45C0334982874D9C3045E9DF5A0F1 (BE)
8F068991264986BE74289834035CD483F1A0F59D5E04C3D9 (LE)
**Output jacksum 1.7.0**:
8f068991264986be74289834035cd483f1a0f59d5e04c3d9 << matches NppCalc
---------------

**Input**:
Sp?_ÎPÐÜÕ'?5ø{)Ù­H)$?ÒÔO
5370925FCE50D0DCD5279935F87B29D9AD4829249BD2D44F
**Output gruft 0.2.0**:
ae44b0284e2354d34377084661db2ce10a4076d1e691b2b4 (BE)
**Output NppCalc**:
55204A77FE8B55D7307758585BCE53CE95DF19B71920FBEB (BE)
D7558BFE774A2055CE53CE5B58587730EBFB2019B719DF95 (LE)
**Output jacksum 1.7.0**:
d7558bfe774a2055ce53ce5b58587730ebfb2019b719df95 << matches NppCalc

I am beginning with JavaScript, but as far as I know even those non-ASCII characters directly derived from bytes are handled as 1 byte (when they match ISO-8859-1) or 2 bytes (when they are inside the BMP - Basic Multilingual Plane). But I honestly don't know where to look for errors right now. If I find out something, I'll post again.

Regards

nikola commented 9 years ago

@heliofh the MD4 padding is implemented here:

https://github.com/nikola/gruft/blob/master/src/gruft-tiger192.js#L153

It's heavily tuned for speed, so it is certainly possible that I introduced a bug in there that only shows for certain input block lengths, e.g. 192 bits.

heliofh commented 9 years ago

A few more tests that go wrong, this time with smaller strings (32 and 24 bits). The first one is full ASCII.

Hex         Str
--------------------------
7B746376    {tcv
858EA4A5    …Ž¤¥
6920A8E7    i ¨ç
BB9A6132    »ša2
74805C      t€\

If you still have some old versions of the code, it's possible to check whether it was performing correctly or not. (If I've checked correctly, only newer versions are available here, which is understandable.) The problem occurs so rarely that I'm more suspicious of some S-box that is rarely used than a problem with padding, which occurs for every input. I'm also suspicious about the base91 decoding for those S-boxes, as it's being used for Tiger but not for SHA-256 or MD5. (I guess you did that trying to keep files smaller, because Tiger have unusually large boxes.)

Last but not least, you are doing a great job with this library. I'll try to help. And as far as I can tell, it's the only JS Library that supports Tiger natively. :+1:

nikola commented 9 years ago

@heliofh yes, base94-encoding of the S-boxes results in much smaller source code.

Essentially, the library was written in 2006 when transparent GZip'ing of JS source files was not as prevalent on the majority of servers as it is today, so back then I decided to compact the S-boxes. Though I also verified that the decoded S-boxes were identical to the originals. I'd be very surprised if that's the problem.

Out of curiosity, what do you intend to use it for?

heliofh commented 9 years ago

Well... then I guess it's probably not the s-boxes nor the decoding. I don't know if this is good or bad.

My use for it is actually quite silly. As I trying to learn some things in Cryptography and JavaScript, I've decided that I could practice it doing an interesting webpage. I like the idea of having a master password to derive other passwords from. Just like softwares like the PasswordMaker does. But that software was abandoned and it only hashed passwords twice at maximum (in HMAC mode). The "distance" between the master password and the child passwords is too small. And It is really non-standardized. I want one that uses bcrypt, scrypt, pbkdf2 or n-times iterated hash functions (user choice). And Tiger is a good function for that scenario (good in software, not so good in hardware and not so popular in usage). So far I have a source code that is one file, big, ugly and messy. :(

But I can do lots of testings, it works well and I'm learning a lot. JavaScript is better than I've ever imagined. :)

nikola commented 9 years ago

@heliofh that's very interesting. I wish I could help out, but at the moment I'm very busy with other projects.

I've rewritten the MD4 padding in my Tiger/192 implementation today, but it seems that wasn't the culprit. I've also verified that, at least, the first and last elements of the decoded 4 S-box list constants are correct, so that's not the issue, either.

I'm afraid to say that at this point the most likely reason is that in this huge unrolled loop there is a mistake somewhere:

https://github.com/nikola/gruft/blob/master/src/gruft-tiger192.js#L189

Unfortunately, I wrote this almost 9 years ago, and in the meantime migrated my work computer several times, so I don't have access to the historical files anymore.

In any case, as it happens one of my lecturers back in university found a series of weaknesses in the Tiger/192 hash function itself, so I wouldn't recommend using Tiger/192 today. If you are seeking state-of-the-art hash function properties you might want to check out Keccak, which is also available in Javascript.

Ginden commented 8 years ago

JS does not actually have a true byte-sized sequence type.

In fact, it have - UInt8Array is such byte sequence. JavaScript strings can be converted using modern browser TextEncoder API: new TextEncoder("utf-8").encode(str) // returns byte array or Node.js Buffer class new Uint8Array(new Buffer(str, 'utf8')) // return byte array too

coldcoff commented 7 years ago

Note that a solution might be to use the "buffer-enabled" fork: https://github.com/mhertsch/gruft as Buffer is a true byte-sequence and can be brought to browsers as well (for example via browserify). @nikola, maybe you want to review his changes and to adopt that (the commit says it is only a partial integration so far for only tiger192, so there's analogous work to do for the other algorithms). Having support for UInt8Arrays as @Ginden suggests might also be a good idea.

When using the current nikola/gruft one has to ensure (if possible) that the "to-be-hashed-input" (message) has length == byte_length, i. e. is not using any multi-byte chars. The comment from @nikola itself in https://github.com/nikola/gruft/blob/master/src/gruft-tiger192.js#L165 "Assume that only 1 byte wide characters are used." says it all ... It might be a good idea to reject string input that contains multibyte chars and to enforce using buffer or Uint8Array input in such cases.