BitLocker-opencl segfault

solardiz commented 4 years ago

When running --test --format=opencl on a V100 on AWS, most runs crash at BitLocker-opencl (but one run did not). However, when running --test --format=bitlocker-opencl, that one benchmark on its own succeeds.

I caught one of those crashes in gdb:

Benchmarking: BitLocker-opencl, BitLocker [SHA256 AES OpenCL]... LWS=32 GWS=81920 (2560 blocks) 
Thread 1 "john-avx2" received signal SIGSEGV, Segmentation fault.
0x000000000071b50e in set_key (key=<optimized out>, index=<optimized out>) at opencl_bitlocker_fmt_plug.c:720
720     }
Missing separate debuginfos, use: debuginfo-install glibc-2.26-34.amzn2.x86_64 keyutils-libs-1.5.8-3.amzn2.0.2.x86_64 krb5-libs-1.15.1-37.amzn2.2.2.x86_64 libcom_err-1.42.9-12.amzn2.0.2.x86_64 libcrypt-2.26-34.amzn2.x86_64 libgomp-7.3.1-6.amzn2.0.4.x86_64 libselinux-2.5-12.amzn2.0.2.x86_64 openssl-libs-1.0.2k-19.amzn2.0.3.x86_64 pcre-8.32-17.amzn2.0.2.x86_64 zlib-1.2.7-18.amzn2.x86_64
(gdb) bt
#0  0x000000000071b50e in set_key (key=<optimized out>, index=<optimized out>) at opencl_bitlocker_fmt_plug.c:720
#1  0x6f4d656c69466563 in ?? ()

(gdb) disass $pc-40,$pc+1
Dump of assembler code from 0x71b4e6 to 0x71b50f:
   0x000000000071b4e6 <set_key+86>:     mov    %eax,%edi
   0x000000000071b4e8 <set_key+88>:     callq  0x404f10 <memcpy@plt>
   0x000000000071b4ed <set_key+93>:     cmpb   $0xa,-0x70(%rbp)
   0x000000000071b4f1 <set_key+97>:     je     0x71b4fd <set_key+109>
   0x000000000071b4f3 <set_key+99>:     lea    -0x8(%r12),%esi
   0x000000000071b4f8 <set_key+104>:    cmp    $0x2f,%esi
   0x000000000071b4fb <set_key+107>:    jbe    0x71b50f <set_key+127>
   0x000000000071b4fd <set_key+109>:    add    $0x98,%rsp
   0x000000000071b504 <set_key+116>:    pop    %rbx
   0x000000000071b505 <set_key+117>:    pop    %r12
   0x000000000071b507 <set_key+119>:    pop    %r13
   0x000000000071b509 <set_key+121>:    pop    %r14
   0x000000000071b50b <set_key+123>:    pop    %r15
   0x000000000071b50d <set_key+125>:    pop    %rbp
=> 0x000000000071b50e <set_key+126>:    retq   

(gdb) x/16x $rsp
0x7fffffffca68: 0x69466563      0x6f4d656c      0x203a6564      0x0a383334
0x7fffffffca78: 0x61647055      0x654d6574      0x79726f6d      0x65707954
0x7fffffffca88: 0x34203a73      0x39343932      0x39323736      0x6e490a35
0x7fffffffca98: 0x61697469      0x657a696c      0x74737953      0x654d6d65

Looks like a stack overwrite in set_key with some ASCII string:

$ printf "\x63\x65\x46\x69\x6c\x65\x4d\x6f\x64\x65\x3a\x20\n"
ceFileMode:
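For anyone who wants to decode the rest of the dump without typing escapes by hand, here is a minimal sketch in C. It assumes the words come from gdb's `x/16x` on a little-endian x86-64 host, as above; the helper name `words_to_ascii` is made up for this example and is not part of John the Ripper.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode 32-bit words, as printed by gdb's "x/16x" on a little-endian
 * machine, into the bytes they occupy in memory. Bytes that are neither
 * printable ASCII nor a newline are shown as '.'.
 * "out" must have room for 4 * nwords + 1 bytes.
 */
static void words_to_ascii(const uint32_t *words, size_t nwords, char *out)
{
    size_t i, j;

    for (i = 0; i < nwords; i++) {
        for (j = 0; j < 4; j++) {
            /* The least significant byte sits at the lowest address */
            unsigned char c = (unsigned char)((words[i] >> (8 * j)) & 0xff);
            int printable = (c >= 0x20 && c < 0x7f) || c == '\n';

            out[4 * i + j] = printable ? (char)c : '.';
        }
    }
    out[4 * nwords] = '\0';
}
```

Fed the 16 words from the `x/16x $rsp` output, this should give `ceFileMode: 438`, then `UpdateMemoryTypes: 4294967295`, then the start of `InitializeSystemMe`, each on its own line. The first 12 bytes match the printf check above.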

solardiz commented 4 years ago

@e-ago Would you be interested in debugging and fixing this?

solardiz commented 4 years ago

The #4309 fix might or might not have worked around this issue, but either way set_key must not overwrite its stack frame even when called with an incorrectly long key. So even if the issue is no longer reproducible as above, it still needs a separate fix.
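A rough sketch of what a defensive set_key could look like, not the actual opencl_bitlocker_fmt_plug.c code. The `PLAINTEXT_LENGTH` value, `MAX_KEYS`, and the `saved_key` buffer are placeholders chosen for this example, and the signature is simplified.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical limits and storage, for illustration only */
#define PLAINTEXT_LENGTH 27
#define MAX_KEYS 4

static char saved_key[MAX_KEYS][PLAINTEXT_LENGTH + 1];

/*
 * Store a candidate password. A key longer than PLAINTEXT_LENGTH is
 * truncated rather than copied in full, so neither the destination
 * buffer nor the caller's stack frame can be overrun.
 */
static void set_key(const char *key, int index)
{
    size_t len = 0;

    /* Ignore out-of-range slots instead of writing past the array */
    if (index < 0 || index >= MAX_KEYS)
        return;

    /* Bounded length scan: never reads or counts past the limit */
    while (len < PLAINTEXT_LENGTH && key[len] != '\0')
        len++;

    memcpy(saved_key[index], key, len);
    saved_key[index][len] = '\0';
}
```

For a slow hash the extra bounded scan per key should be lost in the noise. Whether to truncate silently or report a core bug in this case is a separate design choice.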

magnumripper commented 4 years ago

In Jumbo, formats' code can assume that no over-long key is passed to set_key(). If it is, it's a mode or core bug.

solardiz commented 4 years ago

Thanks, @magnumripper. Do we need to let formats assume that no over-long key is passed to set_key(), though? For slow hashes like this, it doesn't matter performance-wise, so I think it's better to use defensive programming. For fast hashes, it might sometimes matter somewhat, but when it does, this indicates the format should have built-in mask support and that should be used for full performance anyway.

Do you know of specific formats that currently make this assumption, and why they choose to make it?

magnumripper commented 4 years ago

I'm pretty sure there are many of them, perhaps predominantly SIMD ones. Most are now using common-simd-setkey32.h so it's only one file to fix (if we want to). It does contain this:

#ifdef DEBUG
    /* This function is highly optimized and assumes that we are
       never ever given a key longer than fmt_params.plaintext_length.
       If we are, buffer overflows WILL happen */
    if (len > PLAINTEXT_LENGTH) {
        fprintf(stderr, "\n** Core bug: got len %u\n'%s'\n", len, _key);
        error();
    }
#endif

magnumripper commented 4 years ago

> For fast hashes, it might sometimes matter somewhat, but when it does, this indicates the format should have built-in mask support and that should be used for full performance anyway.

I can imagine how an OMP format could use built-in mask, but I'm not sure how we'd use it for fast SIMD formats?

solardiz commented 4 years ago

> I can imagine how an OMP format could use built-in mask, but I'm not sure how we'd use it for fast SIMD formats?

SIMD-aware mask mode is difficult, but potentially even more efficient. I had thought of something like this (before we even got mask mode) for LM hashes specifically, where we'd be altering the already transposed key bits matrix, instead of having to transpose it for each set of hash computations like we do now (byte transpose in set_key and bit transpose in crypt_all).

But you're right, realistically we're not getting that functionality - it's too much work and it complicates further maintenance.
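To illustrate the idea of altering an already transposed matrix, here is a toy sketch in C. It is not John's LM code: the layout (one `uint32_t` per key bit, 32 keys per vector), the names `transpose_keys` and `set_position`, and the sizes are all assumptions made for this example.

```c
#include <stdint.h>
#include <string.h>

#define NKEYS  32  /* one candidate key per bit of a uint32_t */
#define KEYLEN 7   /* characters per key in this toy layout */

/*
 * Full transpose: afterwards, bit k of bits[8 * pos + b] is bit b of
 * character "pos" of key k. This is the work a format would otherwise
 * repeat for every batch of candidates.
 */
static void transpose_keys(unsigned char keys[NKEYS][KEYLEN],
                           uint32_t bits[8 * KEYLEN])
{
    int k, pos, b;

    memset(bits, 0, 8 * KEYLEN * sizeof(bits[0]));
    for (k = 0; k < NKEYS; k++)
        for (pos = 0; pos < KEYLEN; pos++)
            for (b = 0; b < 8; b++)
                if ((keys[k][pos] >> b) & 1)
                    bits[8 * pos + b] |= (uint32_t)1 << k;
}

/*
 * Mask-style update: replace the character at one position for all
 * keys by rewriting only the 8 bit vectors of that position. The other
 * 8 * (KEYLEN - 1) vectors stay as they are.
 */
static void set_position(uint32_t bits[8 * KEYLEN], int pos,
                         const unsigned char chars[NKEYS])
{
    int k, b;

    for (b = 0; b < 8; b++) {
        uint32_t v = 0;

        for (k = 0; k < NKEYS; k++)
            v |= (uint32_t)((chars[k] >> b) & 1) << k;
        bits[8 * pos + b] = v;
    }
}
```

With a mask that varies a single position, each new batch would cost one `set_position` call (8 vectors) instead of a full `transpose_keys` (56 vectors). A real implementation could also precompute the 8 vectors per mask character set, which is part of why this gets complicated quickly.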

openwall / john

BitLocker-opencl segfault #4294