Open AlekseyCherepanov opened 5 years ago
I've found a hack to pass only the full baseword: wrap basewords into '\x02' and check the ends using rules, then delete '\x02'.
Well, a printable separator char for border would be ok too (e.g. --single=':=0# =m# Dm D0'
).
Not sure if there are other (faster) ways of achieving this already but otherwise we could add some option. Perhaps a new bool option SingleNoPairing
defaulting to false? On the other hand, if our format has a KPC of 24 or 2097152 anyway, why not instead have an option that merely limits it according to the KPC? For plaintext format IIRC KPC is always 1 (so would be equivalent to SingleNoPairing = Y
), but for eg. raw-md5 it can well be 24 or even 2097152 for OpenCL. Setting a lower limit is futile!
$ ./JohnTheRipper/run/john --format=plaintext --list=format-all-details
[...]
Min. keys per crypt 1
Max. keys per crypt 130
[...]
But KPC reported in single mode is 8:
$ cat t.conf
.include <john.conf>
[Local:Options]
SingleWordsPairMax = 0
$ echo '$0$H' > t.pw
$ ./JohnTheRipper/run/john t.pw --log-stderr --config=t.conf --single=none
[...]
Loaded 1 password hash (plaintext, $0$ [n/a])
[...]
0:00:00:00 - SingleWordsPairMax increased to 3 for high KPC (8)
[...]
KPC reported in single mode is from key_count
variable, it goes through multiple stages of tweaking in the code.
But there is a warning already that work below min. KPC is inefficient:
if (key_count < single_db->format->params.min_keys_per_crypt) {
if (john_main_process) {
fprintf(stderr,
"Note: Performance for this format/device may be lower due to single mode\n"
" constraints. Format wanted %d keys per crypt but was limited to %d.\n",
single_db->format->params.min_keys_per_crypt,
key_count);
}
log_event(
"- Min KPC decreased from %d to %d due to single mode constraints.",
single_db->format->params.min_keys_per_crypt,
key_count);
}
I guess if a user wants such option and format's KPC is high, then it should be the user's problem to provide appropriate number of candidates, e.g. using rules. But I like useful and efficient defaults.
I am not sure, but it seems that the point where you would like to increase number of candidates is before application of rules, so it is the number of words for rules and not the number of final candidates. So at this point in code, it is not possible to know the final number easily because there may be rejects. Is it so? (With the option for limit, it might work this way: emit baseword, process rules and fill the buffer, if there is space in buffer, emit another baseword but drop excessive candidates.)
SingleNoPairing
would not be enough, because there is word extraction that adds additional candidates too.
django-scrypt
has min. KPC 1, while single mode wants buffer of size 8:
$ ./JohnTheRipper/run/john --format=django-scrypt --list=format-all-details
[...]
Min. keys per crypt 1
Max. keys per crypt 1
[...]
[...]
0:00:00:00 - SingleWordsPairMax increased to 3 for high KPC (8)
Plaintext format shows that set_key()
is called only for real candidates. There is no hashing-wise overhead from under-filled buffer in single mode, i.e. when we have more than min. KPC but less than key_count
. Is this buffer needed to reduce overhead from set_salt()
needed to switch between hashes? Is this overhead important for slow hashes?
I think precise control is needed for very slow hashes mostly. Not all slow formats have min. KPC 1: bcrypt has min. KPC 3 (and max. KPC 3).
I recall single mode have a floor of 8 for something... I need to revisit the code.
So I tried 1 candidate per salt: p/s is 2 times lower than c/s for django-scrypt. I picked 40 lines and results match. I traced with gdb: set_key()
is called correct number of times. Then what is c/s? How is it computed?
I tried to raise number of candidates using rules. Results:
| Cand. | time (s) | speed |
|-------+----------+-------------------|
| 1 | 4.264 | 10.30p/s 20.10c/s |
| 2 | 6.070 | 13.79p/s 20.44c/s |
| 3 | 8.081 | 15.53p/s 20.53c/s |
| 4 | 9.905 | 16.66p/s 20.70c/s |
| 5 | 11.925 | 17.16p/s 20.49c/s |
| 6 | 13.755 | 17.78p/s 20.66c/s |
| 7 | 15.530 | 18.34p/s 20.88c/s |
| 8 | 17.606 | 18.44p/s 20.68c/s |
| 9 | 19.527 | 18.69p/s 20.71c/s |
single.conf (there is "^A" for '\x01' in real sj2.pw):
.include <john.conf>
[Local:Options]
SingleRetestGuessed = N
SingleWordsPairMax = 0
sj2.pw:
#cijoviro16#^Ascrypt$Q96YFTgDlIOj$15$8$1$64$IMYpKhIIHB14JzLfK8lNuTk2r0tVOWgwzFoaCQitmbAaFHIzg6G9FfwIBa7sPrBOVu0IUbBjBypJOYeLF3cWRQ==
#jojovisu84#^Ascrypt$nd7ylGyrMj33$15$8$1$64$dQGh4FA0g10+r5eMz5wXyUlZnUeKC/iocT4GMoiZzsnh2IvSun7SCijSpKNty3Dj0ta9cW7andKEmVWkHTP9Hg==
trace_django.gdb:
set pagination off
break main
run </dev/null --config=single.conf sj2.pw --field-separator-char='\x01' --single=':=0# =m# Dm D0'
del 1
break scrypt_set_key
break *fmt_django_scrypt.methods.set_salt
continue
while 1
continue
end
Trace:
$ gdb -x trace_django.gdb --batch ./JohnTheRipper/run/john 2>&1 | grep set_
Breakpoint 3, set_salt (salt=0x55555732e6d4) at django_scrypt_fmt_plug.c:195
Breakpoint 2, scrypt_set_key (key=0x55555914c740 "notastrongpassword", index=0) at django_scrypt_fmt_plug.c:274
Breakpoint 3, set_salt (salt=0x55555732e6d4) at django_scrypt_fmt_plug.c:195
Breakpoint 2, scrypt_set_key (key=0x55555914c740 "realmenuseJtR", index=0) at django_scrypt_fmt_plug.c:274
Breakpoint 3, set_salt (salt=0x555559150000) at django_scrypt_fmt_plug.c:195
Breakpoint 2, scrypt_set_key (key=0x7fffffffb0d0 "cijoviro16", index=0) at django_scrypt_fmt_plug.c:274
Breakpoint 3, set_salt (salt=0x5555591501e0) at django_scrypt_fmt_plug.c:195
Breakpoint 2, scrypt_set_key (key=0x7fffffffb0d0 "jojovisu84", index=0) at django_scrypt_fmt_plug.c:274
Breakpoint 3, set_salt (salt=0x555559150000) at django_scrypt_fmt_plug.c:195
Breakpoint 3, set_salt (salt=0x5555591501e0) at django_scrypt_fmt_plug.c:195
Then what is c/s? How is it computed?
In any cracking mode but single (as you probably know), it's like this (edited several times):
I believe hashcat is currently only showing c/s, which is a constant confusion to end users (given a thousand salts it might look like we're a thousand times slower 🙄).
In single mode however, all bets are off. Aren't p/s and c/s the same? The keyspace is different for each salt.
I found the reason: single mode does additional crypt_all()
without candidate. So we get adequate c/s and lower p/s. With more candidates, the stake of empty crypt_all()
is lower.
I made fake scrypt hashes to avoid rules:
$ printf '%03d:scrypt$%012d$15$8$1$64$aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa==\n' {1..4} > test2.pw
$ wc -l test2.pw
2 test2.pw
Changes in the tracer:
run </dev/null test2.pw --single=none --skip-self-tests
[...]
break *fmt_django_scrypt.methods.crypt_all
Output:
$ gdb -x trace_django.gdb --batch ./JohnTheRipper/run/john 2>&1 | grep -e set_ -e crypt_all | sed -e 's/ at .*//'
Breakpoint 3, set_salt (salt=0x55555914fb08)
Breakpoint 2, scrypt_set_key (key=0x7fffffffb0f0 "001", index=0)
Breakpoint 4, crypt_all (pcount=0x7fffffffac9c, salt=0x55555914faa0)
Breakpoint 3, set_salt (salt=0x55555914fcc0)
Breakpoint 2, scrypt_set_key (key=0x7fffffffb0f0 "003", index=0)
Breakpoint 4, crypt_all (pcount=0x7fffffffac9c, salt=0x55555914fc58)
Breakpoint 3, set_salt (salt=0x55555914fb08)
Breakpoint 4, crypt_all (pcount=0x7fffffffad5c, salt=0x55555914faa0)
Breakpoint 3, set_salt (salt=0x55555914fcc0)
Breakpoint 4, crypt_all (pcount=0x7fffffffad5c, salt=0x55555914fc58)
The following message suggests that the effect was expected:
Almost done: Processing the remaining buffered candidate passwords, if any.
That is strange. Can you reproduce it with John proper?
On Thu, Oct 10, 2019 at 06:19:26PM -0700, magnum wrote:
In any cracking mode but single (as you probably know), it's like this:
- p/s is how many candidates per second we're doing.
- c/s is p/s divided by number of unique salts
- C/s is c/s but showing the boost from same-salt hashes
This is only exactly correct when both nothing gets cracked and we are (and were) not running single mode (nor are in batch mode, where single mode is a step).
When the number of hashes (and maybe salts) changes as some hashes get cracked, these figures become progressively more independent, and you can't compute one from another.
They are average password candidates (p/s), hash computations (c/s), and tested {computed hash, stored hash} combinations (C/s) per second.
In single mode however, all bets are off. Aren't all three just the same?
No. They are still the averages as explained above.
Yes, the same happens with John proper.
Fake bcrypts:
$ printf '%03d:$2a$05$%021d.E5YPO9kmyuRGyh0XouQYb4YMJKvyOeW\n' {1..4} > bf2.pw
$ cat bf2.pw
001:$2a$05$000000000000000000002.E5YPO9kmyuRGyh0XouQYb4YMJKvyOeW
003:$2a$05$000000000000000000004.E5YPO9kmyuRGyh0XouQYb4YMJKvyOeW
john building with debugging symbols (add -g
to all commands, remove -s
for ld
):
john-1.9.0/src$ make OMPFLAGS=-g LDFLAGS= linux-x86-64-avx2
Investigate min/max kpc:
john-1.9.0/src$ gdb ../run/john
[...]
(gdb) break main
Breakpoint 1 at 0x555555559e90: file john.c, line 666.
(gdb) run
[...]
(gdb) p fmt_BF
[...]
min_keys_per_crypt = 3,
max_keys_per_crypt = 3,
[...]
Tracer:
set pagination off
tbreak main
run </dev/null bf2.pw --single
break *fmt_BF.methods.set_key
break *fmt_BF.methods.set_salt
break *fmt_BF.methods.crypt_all
continue
while 1
continue
end
I edited john.conf
to leave only :
in [List.Rules:Single]
.
Trace:
$ gdb -x trace_bf.gdb --batch ./john-1.9.0/run/john 2>&1 | grep -e set_salt -e crypt_all -e set_key | sed -e 's/ at .*//'
[...self tests...]
Breakpoint 3, set_salt (salt=0x5555557f6688)
Breakpoint 2, set_key (key=0x7fffffffb500 "003", index=0)
Breakpoint 4, crypt_all (pcount=0x7fffffffb0ac, salt=0x5555557f6648)
Breakpoint 3, set_salt (salt=0x5555557f6588)
Breakpoint 2, set_key (key=0x7fffffffb500 "001", index=0)
Breakpoint 4, crypt_all (pcount=0x7fffffffb0ac, salt=0x5555557f6548)
Breakpoint 3, set_salt (salt=0x5555557f6688)
Breakpoint 4, crypt_all (pcount=0x7fffffffb18c, salt=0x5555557f6648)
Breakpoint 3, set_salt (salt=0x5555557f6588)
Breakpoint 4, crypt_all (pcount=0x7fffffffb18c, salt=0x5555557f6548)
So additional crypt_all()
happens even if previous call was with number of candidates under min. KPC.
The following message suggests that the effect was expected:
Almost done: Processing the remaining buffered candidate passwords, if any.
I was wrong: this message is printed before candidates when there is 1 candidate per salt. So it is not related to the empty crypt_all()
in the end.
I am not sure that it is empty, it may check the last candidate against every salt, but I did not check that. Anyway the additional calls to crypt_all()
seem wrong.
Knowing precise total number of candidates, it is possible to exit before the additional calls using --max-candidates=
option.
In order to spot an excessive call to crypt_all()
you really need to monitor set_salt()
and [at least index 0 of] set_key()
. I'd also add one to clear_keys()
if there is one. Just throw some standard debug print, like:
fprintf(stderr, "%s() at %s:%u\n", __FUNCTION__, __FILE__, __LINE__);
if (index == 0)
fprintf(stderr, "%s() at %s:%u\n", __FUNCTION__, __FILE__, __LINE__);
I'm currently doing stuff within this context, will look at what's mentioned here as well.
Whatever stuff we add should probably be covered by #4408 as well (along with the normal brief OPTIONS description).
I did not find any option to disable word extraction fully.
"Fully" would end up with us having zero words... Do you mean from login field, or what? Or maybe you mean the word splitting, like taking the first character of gecos and use as a word of its own (the "q" ripped from "qwe,567"), and things like that. We can probably add means to disable that (or perhaps optionally only allow it up to a single batch of keys, somehow).
We have SingleSkipLogin, perhaps we could also have the complement of that, such as SingleOnlyLogin.
I did not find way to disable generation of pairs, because SingleWordsPairMax is raised automatically
We now have ~I have a pending patch that adds~ --single-pair-max=N
where N can be zero - and when the command-line option is used, the auto-bump is disabled. Seems to work fine. Having said that, I've never been interested in limiting single mode this way, quite the opposite! As long as things are tried in a sensible order, what's the problem with it digging out many morphemes and pairs? Just like with incremental mode, you can abort it once the crack rate has decreased to "too boring".
Or put it like this: Perhaps we should instead focus on optimizing the order of words, morphemes and pairs?
additional crypt_all() happens even if previous call was with number of candidates under min. KPC.
It looks to me the "extra" calls to crypt_all seen above is because we test the last batch of keys with other salts. That's why no set_key
is seen. This is already configurable, with SingleRetestGuess option.
Looks like @magnumripper worked on some aspects of this issue, but doesn't intend to do anything else on it. If so, should we close this issue? Maybe with a brief summary on what was done, and what was not and why not. Thanks.
CMIYC 2019 created interest to feed candidates into single mode with high precision because the hashes are very slow (real world example: Ashley Madison leak). E.g. it may be desirable to use only specified candidates without any additional candidates.
I'll mix several problems, because probable goal may be reached not fixing all of them.
Example, current state:
I would like to be able to try only
qwe,567
.Optionally, with 'SingleSkipLogin = N', it may be useful to be able to try only
abc,123
andqwe,567
candidates (but not pairs with them).john is quite fresh: