openwall / john

John the Ripper jumbo - advanced offline password cracker, which supports hundreds of hash and cipher types, and runs on many operating systems, CPUs, GPUs, and even some FPGAs
https://www.openwall.com/john/
Other
10.36k stars 2.11k forks source link

dynamic format can't handle some combinations of interleaving factors #1127

Closed magnumripper closed 9 years ago

magnumripper commented 9 years ago

First seen on 32-bit CLANG:

make -s distclean && ./configure CC="clang -m32" --host=i686-mac-darwin --enable-asan --disable-openmp && make -sj8

I guess ASan wasn't enabled after all (we should add the alternatives, that's what JTR_FLAG_CHECK is for!)

checking if clang -m32 supports -fsanitize=address... no

but anyways

$ ../run/john -test=0 -form:cpu
(...)
Testing: dynamic_17 [phpass ($P$ or $H$) 128/128 AVX 4x4x4]... Warning: binary() returned misaligned pointer
PASS
Testing: dynamic_18 [md5($s.Y.$p.0xF7.$s) (Post.Office MD5) 32/32 128x1 (MD5_body)]... PASS
Testing: dynamic_19 [Cisco PIX (MD5) 128/128 AVX 8x4x4]... PASS
Testing: dynamic_20 [Cisco ASA (MD5 salted) 128/128 AVX 8x4x4]... PASS
Testing: dynamic_22 [md5(sha1($p)) 128/128 AVX 8x4x3]... Segmentation fault: 11
magnumripper commented 9 years ago

The "Warning: binary() returned misaligned pointer" is fixed in 84e7319. The segfault is not addressed yet.

magnumripper commented 9 years ago
(gdb) r
Starting program: ../run/john -test=0 -form:dynamic_22
Testing: dynamic_22 [md5(sha1($p)) 128/128 AVX 8x4x3]... -
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x9ce03824
0x0003eab8 in hex_out_buf [inlined] () at dynamic_big_crypt.c:134
134         *((unsigned short*)cpo) = itoa16_w2[*cpi++];
(gdb) bt
#0  0x0003eab8 in hex_out_buf [inlined] () at dynamic_big_crypt.c:134
#1  0x0003eab8 in large_hash_output (cpi=<value temporarily unavailable, due to optimizations>, cpo=0x9ce03824 <Address 0x9ce03824 out of bounds>, in_byte_cnt=20, tid=769) at dynamic_big_crypt.c:262
magnumripper commented 9 years ago

The "Warning: binary() returned misaligned pointer" is fixed in 84e7319

Apparently this did not help (it seemed to help at the time). Then I'm not sure how to fix that, other than to do a tiny alloc at some good spot in the code.

jfoug commented 9 years ago

binary() aligned should be fixed here 60090b4

jfoug commented 9 years ago

Not sure where problem is here. It is doing a 20 byte hex dump (40 bytes) into the input_buf2_X86 array (probably element #0). The input_buf2_X86 is of type MD5_IN. That is a union (double aligned), where it is a buffer PLAINTEXT_LENGTH_X86 + EX_BUF_LEN. I think the size has been hand crafted to be 256 bytes long IIRC. I do not see where the overflow could even remotely be coming from, unless there is some compiler optimization causing issues. The code is VERY hard to read just what it is doing, but I would assume the compiler would have no problems at all 'reading' it.

On your copy, try to remove the inline from hex_out_buf (line 128 or so in dyna_big_crypt). We can NOT run that way. This IS an inline that matters. I just want to see if the problem goes away, forcing the compiler to build a stack frame (well to call a function).

magnumripper commented 9 years ago
Testing: dynamic_22 [md5(sha1($p)) 128/128 AVX 8x4x3]... |
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x36b61833
DoSHA1_FixBufferLen32 [inlined] () at /Users/magnum/src/john/src/dynamic_big_crypt.c:1118
1118        input_buf[total_len] = 0x80;
(gdb) bt
#0  DoSHA1_FixBufferLen32 [inlined] () at /Users/magnum/src/john/src/dynamic_big_crypt.c:1118
#1  0x00041263 in DoSHA1_crypt_sse () at dynamic_big_crypt.c:1173
(gdb) print inbut_buf
No symbol "inbut_buf" in current context.
(gdb) print total_len
$1 = 892678707
magnumripper commented 9 years ago

Using a debug build

Testing: dynamic_22 [md5(sha1($p)) 128/128 AVX 8x4x3]... -
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x64fc555b
DoSHA1_FixBufferLen32 (input_buf=0x18d0900 "", total_len=1668238427) at dynamic_big_crypt.c:1118
1118        input_buf[total_len] = 0x80;
(gdb) bt    
#0  DoSHA1_FixBufferLen32 (input_buf=0x18d0900 "", total_len=1668238427) at dynamic_big_crypt.c:1118
#1  0x0003fcf0 in DoSHA1_crypt_sse () at dynamic_big_crypt.c:1173
(gdb) frame 1
#1  0x0003fcf0 in DoSHA1_crypt_sse () at dynamic_big_crypt.c:1173
1173            loops[i] = DoSHA1_FixBufferLen32(cp, ilen[i]);
(gdb) print i
Cannot access memory at address 0x1

WTF? i is a stack variable. How can it be a pointer to 0x1?

jfoug commented 9 years ago

Men overwrite ? Totllen = 1.6 billion. But not happening on other systems. Hard for me yo ddbg when I can't replicate

jfoug commented 9 years ago

Now that I do most testing with cygwin64 I've sort of ignored a lot of 32 bit stuff. For a while I was about the only one carrying 32 bit testing on. With all the resent reorg we've done, it's not surprising there have been 32 bit issues popping up

magnumripper commented 9 years ago

After all changes to dynamic for AVX2 support, I re-tested this (at 3082d00913) but the problem remains the same.

frank-dittrich commented 9 years ago

Looks like https://github.com/magnumripper/JohnTheRipper/issues/1282 is the same issue (32bit Linux with clang)

magnumripper commented 9 years ago

It seems so - I just confirmed the problem is gone in OSX when using gcc.

../run/john --test=0 -form:dynamic
(...)
All 110 formats passed self-tests!
magnumripper commented 9 years ago

This is about interleaving factors mismatch (as Frank noticed). Here's a failing build

$ for i in 0 30 26; do ../run/john -test=0 -form:dynamic_$i ; done
Testing: dynamic_0 [md5($p) (raw-md5) 128/128 AVX 4x4]... PASS
Testing: dynamic_30 [md4($p) (raw-md4) 128/128 AVX 4x3]... PASS
Testing: dynamic_26 [sha1($p) raw-sha1 128/128 AVX 4x3]... Segmentation fault: 11

So we use 4x (16) for MD5, 3x (12) for MD4 and 3x (12) for SHA1.

We also use some BLOCK_LOOPS figure, to do more at a time. Let's check them too:

$ for i in 0 30 26; do ../run/john --list=format-all-details -form:dynamic_$i | grep "Max. keys"; done
Max. keys per crypt                  128
Max. keys per crypt                  128
Max. keys per crypt                  128

Ok, here's the problem: You can divide 128 by 16 but not by 12. This can be handled in the code of course, but maybe it's not. Let's have a look at a working build:

Testing: dynamic_0 [md5($p) (raw-md5) 128/128 AVX 10x4x3]... PASS
Testing: dynamic_30 [md4($p) (raw-md4) 128/128 AVX 10x4x3]... PASS
Testing: dynamic_26 [sha1($p) raw-sha1 128/128 AVX 10x4x2]... PASS

Max. keys per crypt                  120
Max. keys per crypt                  120
Max. keys per crypt                  120

In this case, we have interleaving factors of 2 and 3, and use a mkpc of 120 which can be divided by any of them.

frank-dittrich commented 9 years ago

Just changing MD5_SSE_PARA in x86-sse.h from 4 to 3 makes all self tests pass on my 32bit Linux AVX2 system with clang. And the md5crypt and sunmd5 c/s rates also improve: 61568 -> 63504 724 -> 864

magnumripper commented 9 years ago

This may be the whole problem

#   define MAX_KEYS_PER_CRYPT  (((SIMD_COEF_32*BLOCK_LOOPS)/(MD5_SSE_PARA*4))*(MD5_SSE_PARA*4))

It only ensures MD5 gets a key per crypt that is a multiple of interleaving factor.

magnumripper commented 9 years ago

And the md5crypt and sunmd5 c/s rates also improve: 61568 -> 63504 724 -> 864

That's clang-version dependent and off-topic for this issue. But it should be looked into.

frank-dittrich commented 9 years ago

This may be the whole problem

# define MAX_KEYS_PER_CRYPT (((SIMD_COEF_32_BLOCK_LOOPS)/(MD5_SSE_PARA4))(MD5_SSE_PARA_4))

It only ensures MD5 gets a key per crypt that is a multiple of interleaving factor.

That would mean the other *_SSE_PARAs would need to be considered here as well, but I noticed I only need to change MD5_SSE_PARA from 4 to 3 to make --format=dynamic work for 32bit clang.

magnumripper commented 9 years ago

I noticed I only need to change MD5_SSE_PARA from 4 to 3 to make --format=dynamic work for 32bit clang.

Sure, because it happened to make it same as MD4 and "compatible" with SHA1. On THAT very system. You could probably have fixed it by changing MD4 to 4 just as well.

What we really need is a least common denominator or whatever. No matter how we tweak this mess, it will be fragile - I think we should actually rewrite some code. This is rather ridiculous.

magnumripper commented 9 years ago

I'm not sure how to attack this. The code is just b0rken, assuming things that really can not be assumed. We've had several problems with this before but only with OpenMP.

jfoug commented 9 years ago

Please test with this patch 290887c

magnumripper commented 9 years ago

Works like a champ.