openwall / john

John the Ripper jumbo - advanced offline password cracker, which supports hundreds of hash and cipher types, and runs on many operating systems, CPUs, GPUs, and even some FPGAs
https://www.openwall.com/john/
Other
10.11k stars 2.08k forks source link

Audit rules engine optimizations #3517

Open magnumripper opened 5 years ago

magnumripper commented 5 years ago

See #3468, https://github.com/magnumripper/JohnTheRipper/commit/15163f45fa95d3f68c2ae58971f233d706d1279f#commitcomment-31692841 and https://github.com/magnumripper/JohnTheRipper/commit/79559008ace3797ab6d5b6c4da1a4711261b820a#commitcomment-31692820

magnumripper commented 5 years ago

@solardiz https://github.com/magnumripper/JohnTheRipper/commit/79559008ace3797ab6d5b6c4da1a4711261b820a#commitcomment-31692820

[List.Rules:AppendSeason]
- <* Az"[Ss$][uU][mM][mM][eE3][rR]"
- <* Az"[Ww][iI|][nN][tT+][eE3][rR]"
- <* Az"[Ff][aA][lL][lL]"
- <* Az"[Ss][pP][rR][iI][nN][gG]"
- <* Az"[Aa][uU][tT][uU][mM][nN]"
+ a6 Az"[Ss$][uU][mM][mM][eE3][rR]"
+ a6 Az"[Ww][iI|][nN][tT+][eE3][rR]"
+ a6 Az"[Ff][aA][lL][lL]"
+ a6 Az"[Ss][pP][rR][iI][nN][gG]"
+ a6 Az"[Aa][uU][tT][uU][mM][nN]"

I'm not sure this is right. Does the a command check against the format's genuine max length (truncation, like in descrypt), the format's implementation max length (JtR limitation), or/and the specified --max-length? If it includes a check against the format's genuine max length, then e.g. not trying to append "Summer" when there's room only for "Sum" is wrong, and will result in this ruleset cracking fewer passwords than before.

OK, the problem you mention is for real, but it's not new with aN (same problem with <+) and I can't see how your original rules.c would do any different? The check is against rules_max_length, which is set to format's reported max. length.

What we could do in Jumbo is this:

diff --git a/src/rules.c b/src/rules.c
index 705c841ae..c41d1c24b 100644
--- a/src/rules.c
+++ b/src/rules.c
@@ -619,6 +619,8 @@ void rules_init(struct db_main *db, int max_length)

        min_length = options.eff_minlength;
        skip_length = options.force_maxlength;
+       if (db->format->params.flags & FMT_TRUNC && !skip_length)
+               max_length = PLAINTEXT_BUFFER_SIZE - 3;

        if (max_length == rules_max_length)
                return;
magnumripper commented 5 years ago

(there's some more to it than the diff above, but that's the gist of it)

solardiz commented 5 years ago

@magnumripper I think you picked a wrong example. I intentionally chose "Summer" and not the numbers, etc. For the numbers, etc., we assume we also have rules that test shorter ones, so testing truncated numbers would be redundant. This is not the case for words like "Summer".

I don't understand the patch you propose, but FWIW it needs braces around db->format->params.flags & FMT_TRUNC.

magnumripper commented 5 years ago

I think you picked a wrong example.

OK, I thought you placed your comment near the rules you discussed. I edited my post. Anyway, that makes more sense. You mean we should only reject if we can't append even one letter of it.

magnumripper commented 5 years ago

For the Summer case, the only sane way to get it the way you want is to revert that change. Personally I'd rather have it with a6 but I can make a private rule set for that.

I should probably just ditch #3522. What it does is setting rules_max_length as well as * and related variables to RULE_WORD_SIZE whenever FMT_TRUNC but while that often is logical and good, it sometimes isn't (eg. '* would not truncate DEScrypt at 8).

I think we need to discuss exactly how and when we should take FMT_TRUNC in consideration.

magnumripper commented 5 years ago

Here's current behavior. @solardiz is there anything we want to change here? If not, I will add some of these details to the docs, and then I can go on with the OP possibly only reverting some rule changes.

->N reject this rule unless length N or longer is supported
-<N reject this rule unless length N or shorter is supported (--min-length)

The above refer to format's reported length, unless overridden with --min/max-len options. FMT_TRUNC is not considered. Also, if we're in first rule pass (--rules) and there will be a second one later (--rules-stack), both these flags are ignored as in "always accept", since the next rule may change the length.

#   for min_length
@   for (min_length - 1)
$   for (min_length + 1)
*   for max_length
-   for (max_length - 1)
+   for (max_length + 1)

"Here min/max_length is the minimum/maximum plaintext length supported for
the current hash type (possibly overriden by --min/max-len options)."

Same here, as stated in the text. And we're ignoring FMT_TRUNC. On a side note I can't find any use for "(min_length + 1)" while sometimes "(min_length - 2)" would be handy.

<N  reject the word unless it is less than N characters long
>N  reject the word unless it is greater than N characters long
_N  reject the word unless it is N characters long
'N  truncate the word at length N
aN  reject the word unless it will pass min/max lengths after
    adding N characters. Used for early rejection
bN  reject the word unless it will pass min/max lengths after
    removing N characters

Same here, regardless of FMT_TRUNC. And a3 Az"abc" should be the exact same thing as Az"abc" <+ >@ but faster. Oh but there is one difference: If we're in first rule pass (--rules) and there will be a second one later (--rules-stack), the aN and bN commands are ignored as in "always accept", since the next rule may change the length. Is this how we want it? If so, we should make same change to <N and >N as well.

Last of all, in out_OK:, we reject words shorter than format's min. length (possibly overriden by -min-len). And we reject over-long words if --max-len option was used. If it wasn't, we truncate if FMT_TRUNC and reject otherwise. Nothing of this happens if there's another rule coming (--rules-stack) - the word will be accepted without truncation.

Note that the --stdout option has a feature in that --stdout=N will set FMT_TRUNC whereas instead using --max-len=N will not - this is good for testing!

Also note that you can make eg. DEScrypt lose it's FMT_TRUNC property by using -max-len=8.

magnumripper commented 5 years ago

If we're in first rule pass (--rules) and there will be a second one later (--rules-stack), the aN and bN commands are ignored as in "always accept", since the next rule may change the length. Is this how we want it? If so, we should make same change to <N and >N as well.

I'm really not sure I want this. Perhaps that behavior should be dropped from aN and bN instead.

magnumripper commented 5 years ago

...and perhaps we don't want that even for -< and ->. The "NT" and "ShiftToggle" are good examples where not rejecting the rule will only lead to wasted effort for no outcome.

Perhaps we should have separate commands and flags doing the same as these "unless more rules are coming".

solardiz commented 5 years ago

I'm afraid this got beyond the complexity level that I currently have time to fully consider. :-(

So I'll add just one comment: in the "Summer" case, we might or might not want to append just the letter "S" (if that's the only one that fits) depending on whether we also have a rule that tries appending all uppercase letters (or all characters) or not. Ditto for 2 or maybe even 3 characters depending on whether we have such multi-character append rules. It's that tricky. OTOH, with such multi-character appends present, trying to append the word unconditionally may only result in relatively negligible overhead (like a few extra rules appending specific words on top of, say, 95^2 or 26^3 rules appending character combinations).

magnumripper commented 5 years ago

Should we perhaps revert most of 15163f45fa for this release? At least all changes to core rules.

3513

solardiz commented 5 years ago

@magnumripper Probably yes, assuming you have no time to properly test it and fix whatever needs fixing.

solardiz commented 1 year ago

@magnumripper The changes you made per this issue that are still not reverted remain problematic. Specifically, your uses of the a command are reasonable for --max-length, but are unreasonable for a hash type's truncation. Should we fix the rules (remove many but not all uses of the a command) or maybe fix/clarify the implementation/definition of a (make it check against --max-length only, unlike the < command)?

See e.g. the "Summer" example above, which I've just tested against --max-length=8 vs. --stdout=8 vs. cracking of a descrypt hash - out of these three, only the first one works as desired. I used --rules=':a6 Az"Summer"' for this test, which is similar to how you use this command.

You also use a lot of a0, which is similarly problematic - rejects candidates that would be truncated by the hash's maximum - but they were meant to be truncated, not rejected, unless we're sure the rule would only change anything beyond the maximum. I think in your uses there's no such assurance - e.g., you do a0 before s commands, which may substitute characters below the maximum.

That said, some of the uses of a look reasonable to me even given its current behavior. For example, this:

[List.Rules:Append2Letters]
a2 Az"[a-z][a-z]"
-c a2 Az"[A-Z][A-Z]"
-c a2 Az"[a-z][A-Z]"
-c a2 Az"[A-Z][a-z]"

is just right as long as we also have single-letter appends. There are many other such examples. However, this is wrong for the truncation case (it's right for the --max-length case):

[List.Rules:PrependNumNum]
-[c:] a2 \p[c:] A0"[0-9][0-9]"

I still don't nor expect to have time to dive into this, but I do feel the need to point out that this is still problematic.

magnumripper commented 1 year ago

@magnumripper The changes you made per this issue that are still not reverted remain problematic. Specifically, your uses of the a command are reasonable for --max-length, but are unreasonable for a hash type's truncation.

I kind of wish I had never implemented these but reverting all of it now would cause compatibility issues.

Should we fix the rules (remove many but not all uses of the a command) or maybe fix/clarify the implementation/definition of a (make it check against --max-length only, unlike the < command)?

A combination of both sounds like a plan. I should at least look into changing the behavior asap - hopefully a trivial fix.