openwall / john

John the Ripper jumbo - advanced offline password cracker, which supports hundreds of hash and cipher types, and runs on many operating systems, CPUs, GPUs, and even some FPGAs
https://www.openwall.com/john/

hybrid mask with truncation by --stdout=N strange behaviour #3761

Open AlekseyCherepanov opened 5 years ago

AlekseyCherepanov commented 5 years ago

Reposting with minor modifications from comment to #3731.

Same lengths of passwords, same options, but different letters cause different results.

$ printf 'abcdef\nabcde\nabcd\n' > wl1

$ printf 'aaaaaa\nbbbbb\ncccc\n' > wl2

$ cat wl1
abcdef
abcde
abcd

$ cat wl2
aaaaaa
bbbbb
cccc

$ john --wordlist=wl1 --stdout=9 --mask='?w?W'
Using default input encoding: UTF-8
abcdABCD
1p 0:00:00:00 100,00% (2019-04-05 23:11) 12.50p/s abcdABCD

$ john --wordlist=wl2 --stdout=9 --mask='?w?W'
Using default input encoding: UTF-8
aaaaAAAA
bbbbBBBB
ccccCCCC
3p 0:00:00:00 100,00% (2019-04-05 23:11) 27.27p/s ccccCCCC

$ john --list=build-info
Version: 1.8.0.16-jumbo-1-bleeding-7187208 2019-04-05 21:04:49 +0200

magnumripper commented 5 years ago

Yep, I can reproduce (thanks!). This is so remarkably strange I almost don't dare debugging it 😨

magnumripper commented 5 years ago

Here's a hint...

$ ../run/john --wordlist=wl1 --stdout -max-len=9 --mask='?w?W'
Using default input encoding: UTF-8
abcdABCD
1p 0:00:00:00 100.00% (2019-04-08 16:59) 12.50p/s abcdABCD

$ ../run/john --wordlist=wl2 --stdout -max-len=9 --mask='?w?W'
Using default input encoding: UTF-8
aaaaAAAA
bbbbBBBB
ccccCCCC
3p 0:00:00:00 100.00% (2019-04-08 16:59) 42.85p/s ccccCCCC

Here I used -max-len=9 instead of -stdout=9. I got the same output, but in this case it could be considered wrong in a "different" way than in your case. Either way, some code somewhere doesn't differentiate between the two options.

Now, the expected results here may be arguable. For -stdout=9 you'd normally expect it to join overlong words (eg. abcdefABCDEF) and then truncate to 9 (for abcdefABC) because that's what the rules engine would do. And for -max-len=9 you might expect it to drop any input words longer than 4 right away, not spending any time at all on them, because the result can only be over 9 so should be rejected.

Our hybrid modes however have a more simplistic approach: If max total length is 9, mask mode will initially see that we need input words of at most length 4, and tell the parent mode so. Apparently, in both cases "wordlist" will truncate the input before sending it to mask mode for ?w?W.

So why was the result different? Simple. The actual result from wl1 was that all three input words were truncated to abcd and resulted in this:

abcdABCD
abcdABCD
abcdABCD

...and thankfully the consecutive-dupe suppression dropped all but one 😄
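
To make the explanation above concrete, here is a minimal Python sketch of the observed behaviour. It is only a model, not John's actual code: the function name `current_hybrid` is hypothetical, and it assumes the per-word limit is the total limit divided by two because `?w` and `?W` each consume one copy of the word. `?W` is modelled as a case toggle.

```python
def current_hybrid(words, max_len):
    """Model of the behaviour reported in this thread for --mask='?w?W'."""
    # Assumption: ?w and ?W each take one copy of the word, so the parent
    # mode is asked for words of at most max_len // 2 characters.
    word_limit = max_len // 2
    out = []
    for word in words:
        base = word[:word_limit]             # input truncated before masking
        candidate = base + base.swapcase()   # ?w followed by ?W
        if out and out[-1] == candidate:     # consecutive-dupe suppression
            continue
        out.append(candidate)
    return out


wl1 = ["abcdef", "abcde", "abcd"]
wl2 = ["aaaaaa", "bbbbb", "cccc"]

print(current_hybrid(wl1, 9))  # ['abcdABCD']
print(current_hybrid(wl2, 9))  # ['aaaaAAAA', 'bbbbBBBB', 'ccccCCCC']
```

All three wl1 words collapse to the same four-character prefix, so only one candidate survives; the wl2 words keep distinct prefixes, so all three survive.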

magnumripper commented 5 years ago

Now, the actual problems:

With -stdout=9 (join the full words, then truncate the result to 9), expected results from wl1 would be

abcdefABC
abcdeABCD
abcdABCD

and wl2

aaaaaaAAA
bbbbbBBBB
ccccCCCC

With -max-len=9 (reject any input word longer than 4), expected results from wl1 would be

abcdABCD

and wl2

ccccCCCC
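
The two expectations discussed earlier in this thread (truncate the joined candidate for -stdout=9, reject overlong candidates for -max-len=9) can be sketched in Python as follows. This is an illustration of the intended semantics only, not a proposed patch; `expected_stdout` and `expected_max_len` are hypothetical names, and `?W` is again modelled as a case toggle.

```python
def apply_mask(word):
    """Model of --mask='?w?W': the word followed by its case-toggled copy."""
    return word + word.swapcase()


def expected_stdout(words, n):
    """-stdout=N semantics: build the full candidate, then truncate it to N."""
    return [apply_mask(w)[:n] for w in words]


def expected_max_len(words, n):
    """-max-len=N semantics: drop candidates that would exceed N."""
    return [c for c in map(apply_mask, words) if len(c) <= n]


wl1 = ["abcdef", "abcde", "abcd"]
wl2 = ["aaaaaa", "bbbbb", "cccc"]

print(expected_stdout(wl1, 9))   # ['abcdefABC', 'abcdeABCD', 'abcdABCD']
print(expected_stdout(wl2, 9))   # ['aaaaaaAAA', 'bbbbbBBBB', 'ccccCCCC']
print(expected_max_len(wl1, 9))  # ['abcdABCD']
print(expected_max_len(wl2, 9))  # ['ccccCCCC']
```

A real implementation would presumably reject overlong input words before building the candidate, as suggested above, rather than filtering afterwards; the filter here just keeps the sketch short.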

I have no intention of fixing this before the upcoming release, but possibly for the next one.