Open solardiz opened 5 years ago
Agreed - I thought I already added the H/H2 split.
I thought I already added the H/H2 split.
As I understand: to MD5 yes, to MD4 not yet.
A relevant question for us: do we still want to be able to use OpenSSL's MD[45] as well? I doubt it. While OpenSSL might provide assembly implementations, it also zeroizes memory on *_Final
, which for something as fast as MD[45] probably undoes any advantage the assembly code might have provided. Also, going inside of OpenSSL implementation-specific context structs in pbkdf2_hmac_md5.h is no good - we can do that reliably for our own implementation-specific context structs, but not for third-party ones (not long-term).
So I suggest we fully standardize on our own md[45][.ch] and minimize their differences between my upstream versions and JtR jumbo versions of them.
OTOH, this also enable us to introduce new APIs such as a combined InitUpdate call (initialize new context and do a first update on it at once, thereby saving on function call overhead). Or we can simply make Init macros in .h files, but this will increase code size in the callers without any speed advantage over the combined InitUpdate approach.
Are we talking about same thing?
$ git grep H[,2] md4.c
md4.c:#define H2(x, y, z) ((x) ^ ((y) ^ (z)))
md4.c: STEP(H, a, b, c, d, GET(0) + 0x6ed9eba1, 3)
md4.c: STEP(H2, d, a, b, c, GET(8) + 0x6ed9eba1, 9)
md4.c: STEP(H, c, d, a, b, GET(4) + 0x6ed9eba1, 11)
md4.c: STEP(H2, b, c, d, a, GET(12) + 0x6ed9eba1, 15)
md4.c: STEP(H, a, b, c, d, GET(2) + 0x6ed9eba1, 3)
md4.c: STEP(H2, d, a, b, c, GET(10) + 0x6ed9eba1, 9)
md4.c: STEP(H, c, d, a, b, GET(6) + 0x6ed9eba1, 11)
md4.c: STEP(H2, b, c, d, a, GET(14) + 0x6ed9eba1, 15)
md4.c: STEP(H, a, b, c, d, GET(1) + 0x6ed9eba1, 3)
md4.c: STEP(H2, d, a, b, c, GET(9) + 0x6ed9eba1, 9)
md4.c: STEP(H, c, d, a, b, GET(5) + 0x6ed9eba1, 11)
md4.c: STEP(H2, b, c, d, a, GET(13) + 0x6ed9eba1, 15)
md4.c: STEP(H, a, b, c, d, GET(3) + 0x6ed9eba1, 3)
md4.c: STEP(H2, d, a, b, c, GET(11) + 0x6ed9eba1, 9)
md4.c: STEP(H, c, d, a, b, GET(7) + 0x6ed9eba1, 11)
md4.c: STEP(H2, b, c, d, a, GET(15) + 0x6ed9eba1, 15)
As #2356 says: My ultimate goal is to get rid of external crypt/hash lib altogether. We nearly always need our "own" code anyway, for OpenCL, so there's no point in using SSL for CPU.
Are we talking about same thing?
Ah, right. So the H/H2 is already in there. Don't know how I missed that.
Where can I find your upstream code?
Where can I find your upstream code?
https://openwall.info/wiki/people/solar/software/public-domain-source-code
But I guess I'll take care of this issue myself.
BTW, this provides some speedup for formats that copy contexts, such as SIP (as reminded by #1990), EIGRP, HSRP, tacacs-plus:
+++ b/src/md5.h
@@ -34,7 +34,9 @@ typedef struct {
MD5_u32plus A, B, C, D;
MD5_u32plus lo, hi;
unsigned char buffer[64];
+#if !ARCH_ALLOWS_UNALIGNED
MD5_u32plus block[16];
+#endif
} MD5_CTX;
extern void MD5_Init(MD5_CTX *ctx);
We should probably make this change along with the update, or make sure it's preserved through the update if we make it separately. Ditto for MD4.
Jumbo uses a revision of my generic MD4 and MD5 implementations from some years back. I've since revised the upstream versions in several ways. Jumbo also made its changes. We should probably take my latest revisions and reintroduce only the changes that we actually still want to have.
One change we're missing is my addition of H/H2 split to MD4, similar to what I had introduced for MD5 earlier, to help the compiler detect the common subexpressions.
This commit by JimF gets in the way: abe2cdb591b03c871f2c0d6e046cbeed36c6dee4. As I understand, it primarily renames
a, b, c, d
to uppercaseA, B, C, D
, which is only made use of in pbkdf2_hmac_md5.h. I doubt matching OpenSSL's context layout in any other ways is relevant.