Open magnumripper opened 4 years ago
Also reproducible on AWS:
$ ./john -stress-test=0 -form=mscash2-opencl
GPU memory at start: 0 MiB
#1 Testing: mscash2-opencl, MS Cache Hash 2 (DCC2) [PBKDF2-SHA1 OpenCL]... Device 1: Tesla V100-SXM2-16GB
PASS
403 MiB
#2 Testing: mscash2-opencl, MS Cache Hash 2 (DCC2) [PBKDF2-SHA1 OpenCL]... Segmentation fault
#2 Testing: mscash2-opencl, MS Cache Hash 2 (DCC2) [PBKDF2-SHA1 OpenCL]...
Thread 1 "john" received signal SIGSEGV, Segmentation fault.
0x0000000000763531 in set_key (
key=0x3ffd220 "80808080\200\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064", index=0)
at opencl_mscash2_fmt_plug.c:185
185 key_host[index][i] = key[i] ;
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.amzn2.0.2.x86_64 gmp-6.0.0-15.amzn2.0.2.x86_64 keyutils-libs-1.5.8-3.amzn2.0.2.x86_64 krb5-libs-1.15.1-37.amzn2.2.2.x86_64 libcom_err-1.42.9-12.amzn2.0.2.x86_64 libcrypt-2.26-34.amzn2.x86_64 libgomp-7.3.1-6.amzn2.0.4.x86_64 libselinux-2.5-12.amzn2.0.2.x86_64 openssl-libs-1.0.2k-19.amzn2.0.3.x86_64 pcre-8.32-17.amzn2.0.2.x86_64 zlib-1.2.7-18.amzn2.x86_64
(gdb) bt
#0 0x0000000000763531 in set_key (
key=0x3ffd220 "80808080\200\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064\065\066\067\070\071\060\061\062\063\064", index=0)
at opencl_mscash2_fmt_plug.c:185
#1 0x0000000000685f84 in fmt_self_test_body (full_lvl=<optimized out>, db=0x60f0bf0, salt_copy=0x60f0ce4, binary_copy=0x60ddf14,
format=0xcb0f00 <fmt_opencl_mscash2>) at formats.c:848
#2 fmt_self_test (format=format@entry=0xcb0f00 <fmt_opencl_mscash2>, db=db@entry=0x60f0bf0) at formats.c:1647
#3 0x0000000000677836 in benchmark_all () at bench.c:921
#4 0x000000000068ea6c in john_run () at john.c:1795
#5 main (argc=<optimized out>, argv=<optimized out>) at john.c:2206
(gdb) disass $pc-41,$pc+40
Dump of assembler code from 0x763508 to 0x763559:
0x0000000000763508 <set_key+72>: je 0x76356d <set_key+173>
0x000000000076350a <set_key+74>: cmp $0x2,%r9
0x000000000076350e <set_key+78>: je 0x763560 <set_key+160>
0x0000000000763510 <set_key+80>: cmp $0x3,%r9
0x0000000000763514 <set_key+84>: je 0x763553 <set_key+147>
0x0000000000763516 <set_key+86>: cmp $0x4,%r9
0x000000000076351a <set_key+90>: je 0x763546 <set_key+134>
0x000000000076351c <set_key+92>: cmp $0x5,%r9
0x0000000000763520 <set_key+96>: je 0x763539 <set_key+121>
0x0000000000763522 <set_key+98>: cmp $0x6,%r9
0x0000000000763526 <set_key+102>: jne 0x7635e3 <set_key+291>
0x000000000076352c <set_key+108>: movzbl (%rbx,%r11,1),%r12d
=> 0x0000000000763531 <set_key+113>: mov %r12b,(%rdi,%r11,1)
0x0000000000763535 <set_key+117>: add $0x1,%r11
0x0000000000763539 <set_key+121>: movzbl (%rbx,%r11,1),%esi
0x000000000076353e <set_key+126>: mov %sil,(%rdi,%r11,1)
0x0000000000763542 <set_key+130>: add $0x1,%r11
0x0000000000763546 <set_key+134>: movzbl (%rbx,%r11,1),%edx
0x000000000076354b <set_key+139>: mov %dl,(%rdi,%r11,1)
0x000000000076354f <set_key+143>: add $0x1,%r11
0x0000000000763553 <set_key+147>: movzbl (%rbx,%r11,1),%ecx
0x0000000000763558 <set_key+152>: mov %cl,(%rdi,%r11,1)
End of assembler dump.
(gdb) i r
rax 0x7d 125
rbx 0x3ffd220 67097120
rcx 0x0 0
rdx 0x0 0
rsi 0x0 0
rdi 0x0 0
rbp 0x7fffffffcbd0 0x7fffffffcbd0
rsp 0x7fffffffcbc0 0x7fffffffcbc0
r8 0x7e 126
r9 0x6 6
r10 0x0 0
r11 0x0 0
r12 0x38 56
r13 0x3fff2c8 67105480
r14 0x0 0
r15 0xcb0f00 13307648
rip 0x763531 0x763531 <set_key+113>
eflags 0x10246 [ PF ZF IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
This also occurs with a john binary from 2 months ago, so it isn't a recent regression.
mscash2 can't be run by --stress-test. The patch below fixes the segfault, but the run still fails because OpenCL initialization is not done correctly. Either the shared code is not able to handle the needs of mscash2, or mscash2 is not using it properly.
diff --git a/src/opencl_mscash2_fmt_plug.c b/src/opencl_mscash2_fmt_plug.c
index b6bbcf430..4a4e628fb 100644
--- a/src/opencl_mscash2_fmt_plug.c
+++ b/src/opencl_mscash2_fmt_plug.c
@@ -55,6 +55,7 @@ typedef struct {
static cl_uint *dcc_hash_host ;
static cl_uint *dcc2_hash_host ;
+static unsigned int initialized;
static unsigned char (*key_host)[MAX_PLAINTEXT_LENGTH + 1] ;
static ms_cash2_salt currentsalt ;
static cl_uint *hmac_sha1_out ;
@@ -84,8 +85,6 @@ static void init(struct fmt_main *__self)
static void reset(struct db_main *db)
{
- static unsigned int initialized;
-
if (!initialized) {
unsigned int i;
self->params.max_keys_per_crypt = 0;
@@ -145,6 +144,7 @@ static void done(void) {
MEM_FREE(key_host) ;
MEM_FREE(hmac_sha1_out);
releaseAll();
+ initialized = 0;
}
static int valid(char *ciphertext, struct fmt_main *self)
magnum edit: #4398 added the above
Are we confident this issue only affects stress-test? If so, we can reasonably change the issue description to state that and move it to label "Potentially 1.9.0-jumbo-2 material".
> The patch below fixes the segfault
I don't mind getting this in for now. Please feel free to send a PR. Thanks!
> Are we confident this issue only affects stress-test?
I'm not aware of (other) use cases that restart the execution environment like this:
opencl-init()
format-init()
format-done()
opencl-done()
opencl-init()
and so on
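The restart sequence above can be reproduced in miniature. The following is only a minimal sketch, assuming hypothetical names (fake_reset, fake_done, key_buf, second_cycle_ok) that are not part of the John the Ripper source. It shows how a one-time-init flag that is never cleared makes the second cycle skip allocation, and how clearing a file-scope flag in done(), as the patch does, avoids that.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the format's state; not the real JtR code. */
static char *key_buf;             /* plays the role of key_host */
static unsigned int initialized;  /* file scope, so done() can clear it */

/* Allocate the buffer only on the first call after a (re)start. */
static void fake_reset(void)
{
    if (!initialized) {
        key_buf = calloc(128, 1);
        initialized = 1;
    }
}

/* Free the buffer; optionally clear the flag so the next reset reallocates. */
static void fake_done(int clear_flag)
{
    free(key_buf);
    key_buf = NULL;
    if (clear_flag)
        initialized = 0;
}

/* Run two init/done cycles; return 1 if the buffer is usable in cycle 2. */
static int second_cycle_ok(int clear_flag)
{
    int ok;

    initialized = 0;
    fake_reset();           /* first cycle: allocates */
    fake_done(clear_flag);  /* frees; flag cleared only if requested */
    fake_reset();           /* second cycle: allocates only if flag was cleared */
    ok = (key_buf != NULL);
    fake_done(1);           /* always clean up fully at the end */
    return ok;
}
```

With these assumptions, second_cycle_ok(0) returns 0: the buffer is still NULL in the second cycle, which is the state in which a write such as the one in set_key would fault. second_cycle_ok(1) returns 1.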
Me neither. It's easy to think Sayantan was sloppy, but he wasn't - at the time it wasn't ever an issue and I'm not sure we even had any done() in the format interface. We fired up a format and used it, then exited. It took a while until we even saw the problem with a no-format --test.
> OpenCL initialization is not done correctly
I just found a cosmetic side effect of that: the line about the device is printed after the format name. Usually the device is in the very first line.
$ ./run/john --device=2 --test --format=mscash2-opencl
Benchmarking: mscash2-opencl, MS Cache Hash 2 (DCC2) [PBKDF2-SHA1 OpenCL]... Device 2: Hawaii [AMD Radeon (TM) R9 390 Series]
DONE
Raw: 137885 c/s real, 138571 c/s virtual
It also does not do auto-tuning, and it does not print LWS/GWS.
@AlekseyCherepanov mscash2-opencl is special - e.g., it is the only format that can use multiple GPUs without fork, and it pre-dates fork. Do you want to be the one to bring/rewrite it to our current standards?
Oh, I forgot that mscash2 is special. I saw that in the docs, doc/README-OPENCL:
Currently only mscash2-OpenCL support multiple devices by itself. However, all other formats can use it together with MPI or the --fork option.
Actually, from this wording I thought mscash2-opencl was more advanced than other formats and hence more modern.
Changing unfamiliar code would be too much for me now, so I will not pick it up in the foreseeable future.
This indicates problems in e.g. done()