openwall / john

John the Ripper jumbo - advanced offline password cracker, which supports hundreds of hash and cipher types, and runs on many operating systems, CPUs, GPUs, and even some FPGAs
https://www.openwall.com/john/
Other
10.37k stars 2.11k forks source link

misleading description of source() method and undefined behaviour due to invalid pointers #5528

Open AlekseyCherepanov opened 2 months ago

AlekseyCherepanov commented 2 months ago

I was working on a tracer for formats. It prints all arguments of methods. Particularly I tried to print ciphertext argument of source() method. Wrapping fmt_default_source caused stable crashes, so I checked sources and it was obvious that db is adjusted expecting that new source() would reconstruct ciphertext. I was adding my wrappers after init(), so I assumed that the problem was that db was initialized with one configuration and used with other.

So I tried to wrap only non-default source() and print ciphertext as regular string. It mostly worked but there was a crash late during --test=0. Again it was ciphertext argument not pointing to memory. Also the crash was weird because test case could be reduced only to --format='HAVAL-128-4,hdaa,HMAC-SHA1,HMAC-SHA512,dynamic_0' and nothing less (and even order was important).

As far as I understand, john might or might not adjust db to remove ciphertexts. In any case, custom source() should use binary argument and ignore ciphertext. Is that right?

And there are 2 problems:

The wording:

/* Reconstructs the ASCII ciphertext from its binary (saltless only).
 * Alternatively, in the simplest case simply returns "source" as-is. */
    char *(*source)(char *source, void *binary);

Let's remove my tracing code and add just printf("%s\n", ciphertext); into static char *source(...) in dynamic_fmt.c. The crash repeats:

$ gdb -ex 'set pagination 0' -ex run -ex bt --batch --args ../run/john --test=0 --format='HAVAL-128-4,hdaa,HMAC-SHA1,HMAC-SHA512,dynamic_0'
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Testing: HAVAL-128-4 [32/64]... PASS
Testing: hdaa, HTTP Digest access authentication [MD5 128/128 SSE4.1 4x3]... PASS
Testing: HMAC-SHA1 [password is key, SHA1 128/128 SSE4.1 4x]... PASS
Testing: HMAC-SHA512 [password is key, SHA512 128/128 SSE4.1 2x]... PASS
Testing: dynamic_0 [md5($p) (raw-md5) 128/128 SSE4.1 4x3]... 
Program received signal SIGSEGV, Segmentation fault.
__strlen_sse2 () at ../sysdeps/x86_64/multiarch/strlen-sse2.S:142
142 ../sysdeps/x86_64/multiarch/strlen-sse2.S: No such file or directory.
#0  __strlen_sse2 () at ../sysdeps/x86_64/multiarch/strlen-sse2.S:142
#1  0x00007ffff77b7994 in __GI__IO_puts (str=0xad5f187cad5f187c <error: Cannot access memory at address 0xad5f187cad5f187c>) at ./libio/ioputs.c:35
#2  0x000055555563e656 in source (source=<optimized out>, binary=0x5555596ffc30) at dynamic_fmt.c:2721
#3  0x000055555589583b in ldr_load_pw_line (db=db@entry=0x555556f41ca0, line=line@entry=0x7fffffffd880 "") at loader.c:1002
#4  0x00005555558981fc in ldr_init_test_db (format=format@entry=0x7ffff7e87010, real=real@entry=0x0) at loader.c:1308
#5  0x0000555555877924 in benchmark_all () at bench.c:878
#6  0x0000555555890be5 in john_run () at john.c:1681
#7  main (argc=<optimized out>, argv=0x7fffffffe088) at john.c:2101

printf("%s\n", ciphertext); was optimized into puts(ciphertext); but we still can see the argument's value:

#1  0x00007ffff77b7994 in __GI__IO_puts (str=0xad5f187cad5f187c <error: Cannot access memory at address 0xad5f187cad5f187c>) at ./libio/ioputs.c:35

0x ad5f187c ad5f187c does not seem to be a pointer, also gdb says it cannot access memory there.

formats.h could be more explicit that the simplest case should be fmt_default_source and it cannot be anything else.

solardiz commented 2 months ago

Thanks. I don't remember this stuff well enough to comment without reviewing the code first.

By ciphertext, do you mean the char *source argument?

AlekseyCherepanov commented 2 months ago

Ouch! Yes, I meant the char *source argument. Also I should copy-paste actual piece of code inserted in source() in dynamic_fmt.c: printf("%s\n", source);.

AlekseyCherepanov commented 2 months ago

creation of invalid pointer is undefined behaviour in C

It is not true on its own. Some ways to get an invalid pointer might be UB though, e.g. some_array - 1 (link).