openwall / john

John the Ripper jumbo - advanced offline password cracker, which supports hundreds of hash and cipher types, and runs on many operating systems, CPUs, GPUs, and even some FPGAs
https://www.openwall.com/john/
Other
10.24k stars 2.1k forks source link

Sync md[45].[ch] with upstream #3917

Open solardiz opened 5 years ago

solardiz commented 5 years ago

Jumbo uses a revision of my generic MD4 and MD5 implementations from some years back. I've since revised the upstream versions in several ways. Jumbo also made its changes. We should probably take my latest revisions and reintroduce only the changes that we actually still want to have.

One change we're missing is my addition of H/H2 split to MD4, similar to what I had introduced for MD5 earlier, to help the compiler detect the common subexpressions.

This commit by JimF gets in the way: abe2cdb591b03c871f2c0d6e046cbeed36c6dee4. As I understand, it primarily renames a, b, c, d to uppercase A, B, C, D, which is only made use of in pbkdf2_hmac_md5.h. I doubt matching OpenSSL's context layout in any other ways is relevant.

magnumripper commented 5 years ago

Agreed - I thought I already added the H/H2 split.

solardiz commented 5 years ago

I thought I already added the H/H2 split.

As I understand: to MD5 yes, to MD4 not yet.

solardiz commented 5 years ago

A relevant question for us: do we still want to be able to use OpenSSL's MD[45] as well? I doubt it. While OpenSSL might provide assembly implementations, it also zeroizes memory on *_Final, which for something as fast as MD[45] probably undoes any advantage the assembly code might have provided. Also, going inside of OpenSSL implementation-specific context structs in pbkdf2_hmac_md5.h is no good - we can do that reliably for our own implementation-specific context structs, but not for third-party ones (not long-term).

So I suggest we fully standardize on our own md[45][.ch] and minimize their differences between my upstream versions and JtR jumbo versions of them.

OTOH, this also enable us to introduce new APIs such as a combined InitUpdate call (initialize new context and do a first update on it at once, thereby saving on function call overhead). Or we can simply make Init macros in .h files, but this will increase code size in the callers without any speed advantage over the combined InitUpdate approach.

magnumripper commented 5 years ago

Are we talking about same thing?

$ git grep H[,2] md4.c
md4.c:#define H2(x, y, z)                       ((x) ^ ((y) ^ (z)))
md4.c:          STEP(H, a, b, c, d, GET(0) + 0x6ed9eba1, 3)
md4.c:          STEP(H2, d, a, b, c, GET(8) + 0x6ed9eba1, 9)
md4.c:          STEP(H, c, d, a, b, GET(4) + 0x6ed9eba1, 11)
md4.c:          STEP(H2, b, c, d, a, GET(12) + 0x6ed9eba1, 15)
md4.c:          STEP(H, a, b, c, d, GET(2) + 0x6ed9eba1, 3)
md4.c:          STEP(H2, d, a, b, c, GET(10) + 0x6ed9eba1, 9)
md4.c:          STEP(H, c, d, a, b, GET(6) + 0x6ed9eba1, 11)
md4.c:          STEP(H2, b, c, d, a, GET(14) + 0x6ed9eba1, 15)
md4.c:          STEP(H, a, b, c, d, GET(1) + 0x6ed9eba1, 3)
md4.c:          STEP(H2, d, a, b, c, GET(9) + 0x6ed9eba1, 9)
md4.c:          STEP(H, c, d, a, b, GET(5) + 0x6ed9eba1, 11)
md4.c:          STEP(H2, b, c, d, a, GET(13) + 0x6ed9eba1, 15)
md4.c:          STEP(H, a, b, c, d, GET(3) + 0x6ed9eba1, 3)
md4.c:          STEP(H2, d, a, b, c, GET(11) + 0x6ed9eba1, 9)
md4.c:          STEP(H, c, d, a, b, GET(7) + 0x6ed9eba1, 11)
md4.c:          STEP(H2, b, c, d, a, GET(15) + 0x6ed9eba1, 15)
magnumripper commented 5 years ago

As #2356 says: My ultimate goal is to get rid of external crypt/hash lib altogether. We nearly always need our "own" code anyway, for OpenCL, so there's no point in using SSL for CPU.

solardiz commented 5 years ago

Are we talking about same thing?

Ah, right. So the H/H2 is already in there. Don't know how I missed that.

magnumripper commented 4 years ago

Where can I find your upstream code?

solardiz commented 4 years ago

Where can I find your upstream code?

https://openwall.info/wiki/people/solar/software/public-domain-source-code

But I guess I'll take care of this issue myself.

solardiz commented 3 years ago

BTW, this provides some speedup for formats that copy contexts, such as SIP (as reminded by #1990), EIGRP, HSRP, tacacs-plus:

+++ b/src/md5.h
@@ -34,7 +34,9 @@ typedef struct {
        MD5_u32plus A, B, C, D;
        MD5_u32plus lo, hi;
        unsigned char buffer[64];
+#if !ARCH_ALLOWS_UNALIGNED
        MD5_u32plus block[16];
+#endif
 } MD5_CTX;

 extern void MD5_Init(MD5_CTX *ctx);

We should probably make this change along with the update, or make sure it's preserved through the update if we make it separately. Ditto for MD4.