Closed frank-dittrich closed 9 years ago
Definitely. I thought I added an issue similar to this already but perhaps I didn't. This can be made with just one format (any), because we are testing a mode, not a format.
But may be we shouldn't use the fastes format, otherwise, --max-run-time=1 might complete the task without the need to restore, especially if you also use --fork. So, completing the task without the need to restore should cause a warning.
I am building this now. I am using the pw-new.dic, with the 50 at tail appended a few times. I did it for md5-crypt. It seems to take long enough to have multiple runs. We may have to add a number to the -restore switch (it is just a boolean now), so that you can specify number of seconds per run. Right now, I do a 10s run. But there may be fast enough machines that 10s does not cause enough retries.
This will be a self contained mode, just like internal. It runs, and the function does not return, it exits the script returning proper error code.
Can we do this test for core??? I am not sure. I am using --max-run-time which I am 99% sure is a jumbo feature. I am not sure that I could do this for core.
I guess we could thread this somehow (or even just using & to put process in background), and then signal it. Speaking of running that way, that might be a good 2nd test even for jumbo. john likely may behave differently with a signal restart vs a -max-run-time restart. I do not know the restore (or save session) code that well, so I am not 100% sure just what to test.
do you two have any thoughts here? I can probably get a POC done and checked in shortly. I guess once we have it working, we can expand upon it if we need to.
core doesn't have --max-run-time, but Iguess if jumbo passes this test we can safely assume core will pass as well.
You could use ulimit -t 10 ./john ...
(and ignore $?
when john gets killed for exceeding ulimit), but I don't think it is necessary.
$? already is 256 (1) when we hit the time limit. When john completes prior to time limit, return is 0. That is my 'current' check. I am also trying to get the ETA lines to output (as a how it is going), but perl is being a pisser for me.
https://github.com/magnumripper/JohnTheRipper/issues/1859#issuecomment-152661870
For the wordlist + other modes, we need to make sure that the test is long enough, and with enough breaks, that multiple rules are tested, along with different stopping points within the word file. So we may want a longer wordlist and fewer rules, and a shorter wordlist with LOTS of rules.
Also, we may want to perform some tests using -fork and restore (with the above modes also)
We should probably use one of the slowest formats for this, as we need to crack every single generated candidate in order to know we didn't miss one at restore. Perhaps some PBKDF2 with 10000 iterations. We should also use at least 10 salts I guess (so we don't miss catching a bug where not all salts got tested after a stop-restore cycle). Or maybe 100 salts and fewer iterations. We could pick any custom number of iterations, targeting a convenient speed.
Isn't the current implementation producing 3886 candidates while only having 1994 hashes to crack? There's a significant risk we pass a test even though several candidates were missed.
Ah, no, it rejects input longer than 15 chars so we only have 1994 candidates too. At least for SIMD builds, it's fine.
Added wordlist+rules. Using korelogic ^d^d^d rules. I did a shuf | head 4500, then created the wordlist with 38 byte passwords (3200 of them). Then salted-sha1 was used. Takes a couple minutes to run, and I have 10s runtimes. Should be a pretty valid test.
e313cc7
Having issues running this on my VM (64 bit). Had to remove the 'extra' passwords added, and increase time that the exits happen at. Not sure why ....
Do we have a method that will exit john after X number of candidates are found?
$ ../run/john -w=pw-new.dic -rules=PrependNumNumNum -stdout=39 -v:1
7770000p 0:00:00:01 100.00% (2015-11-04 19:33) 6815Kp/s 999ko}GpppVeLj
You test 7,770,000 candidates against 3,200 hashes? If there are a few missed candidates at resume, how would you be sure they are among these 3,200?
Well look at it this way. I have a random 'mix' of 3200 hashes created out of those 7M possible data. This is no different than any other run of the TS. There are candidates which are not part of the 'found' set (or actually ANY password attack). All that matters is when things are all said and done, that all 3200 were found.
how would you list to 'test' this ? I guess we could do a super slow hash, with a 'packed' list of data.
If you can come up with test data, and a set of john tests runs (well, just 1 "run', and then multiple -rest=xxx runs), I will certainly put it into the TS.
The only safe way to test it is to have a hash list that need all the generated candidates in order to fully crack. I believe the first (wordlist only) test is fine as-is but this last one has only a 1/2500 chance of catching a missed candidate. We have to be sure any one missed candidate is catched or this will be pretty much useless.
Maybe we could simply use the default wordlist: ../run/john -w -ru:appendNumNum
- that would total about 35000 candidates. So we create salted-sha1 hashes for them, possibly with just eg. 1000 different salts (trying to keep the speed up). We can modulate speed with number of unique salts but in the end, every single candidate should crack exactly one hash. And we probably want to ensure it gets resumed at least 10 times so we may have to aim for ~15 seconds (if ran without --max-run-time) on a very fast system.
If that ends up fine, we can add an almost identical test for wordlist + mask: ../run/john -w -mask:?w?d?d
, using the same hash list.
Note that all we need to test here is resume. We don't have to involve any over-length candidates or things like that.
I can experiment a little with it later unless you beat me to it. I'm currently working on truecrypt-opencl.
Do we have a method that will exit john after X number of candidates are found?
Using an external filter, we can abort after X number of candidates are tried. See [List.External:AutoAbort]
I created a 35k test, and it was done in 20s (35k of salts). I then changed the rules to do this:
$[0-9]$[1-2] and ended up with 70k. This file took 80s, which is 'about' right for doing 8s max run times. It should take 10-12 restarts to get it all (depends upon how much 'replay' there is. Since it is only 20 rules, there may be a 'lot'.
I'd prefer a mask '?w?l?d'
or '?w?l?s'
over '?w?d?d'
, because it is more likely that the number of candidates generated per input word is not a multiple of max. keys per crypt (or vice versa).
Hmm the batch size of salted sha1 (avx) for 8 threads is 32K, that's too large for this little test. Maybe we should run it with OMP_NUM_THREADS=1 and --mkpc=4093
or some other prime number. That would mitigate @frank-dittrich's worries too.
Then you have to disallow --mkpc= for -passthru for this specific test, or replace 4093 with the value given on the command line.
Using salted-sha1 seemed to be 'hard' to get it to work right all the time. It would get in a loop easily, where it would restart far enough back, that it would find no candidates.
So I made a wordlist of 20 'garbage' words, and used appendnumnum rule (100 rules). I then did a pass_gen creation of bitcoin (which has OMP scale of 1), and set the rounds really low (2000).
Now running on a 10s max time, I get about 3-4% gain each 10s loop (shows up using -v)
I think this is a better test. I will get it cleaned up, and checked in.
--rules=appendNumNum and -mask=?w?d?d work fine with same input file. I will get things checked in. Wasn't mask-hybrid the problem recently? If so, can we make sure the TS fails against the broken version.
NOTE, it is safer to run in the -v mode, so that there is some screen interaction, and so you can watch to make sure the sessions do not get 'locked up'
WOOT WOOT. Shows the bug just fine.
$ ./jtrts.pl -restore
-------------------------------------------------------------------------------
- JtR-TestSuite (jtrts). Version 1.13, Dec 21, 2014. By, Jim Fougeron & others
- Testing: John the Ripper 1.8.0.6-jumbo-1-1562-ga7456a8+ OMP [cygwin 64-bit AVX-ac]
--------------------------------------------------------------------------------
John Jumbo build detected.
Running JTRTS in -restore mode (against a 2000 candidate bitcoin input file -rules=appendNumNum)
** NOTE: may take a couple minutes to run
Done with run
Results (should be 2000) : 2000
Running JTRTS in -restore mode (against a 2000 candidate bitcoin input file -mask=?w?d?d)
** NOTE: may take a couple minutes to run
Done with run
Results (should be 2000) : 1154
$ ./jtrts.pl -restore
-------------------------------------------------------------------------------
- JtR-TestSuite (jtrts). Version 1.13, Dec 21, 2014. By, Jim Fougeron & others
- Testing: John the Ripper 1.8.0.6-jumbo-1-1562-ga7456a8+ OMP [cygwin 64-bit AVX-ac]
--------------------------------------------------------------------------------
John Jumbo build detected.
Running JTRTS in -restore mode (against a 2000 candidate bitcoin input file )
** NOTE: may take a couple minutes to run
Done with run. Results (should be 2000) : 2000 PASS
Running JTRTS in -restore mode (against a 2000 candidate bitcoin input file -rules=appendNumNum)
** NOTE: may take a couple minutes to run
Done with run. Results (should be 2000) : 2000 PASS
Running JTRTS in -restore mode (against a 2000 candidate bitcoin input file -mask=?w?d?d)
** NOTE: may take a couple minutes to run
Done with run. Results (should be 2000) : 1268 FAIL!
As we can see, the count for the mask resume fluctuates. That is exactly like we would expect.
546139a
This now has 3 restore types> NOTE the wordlist+mask shows problems in JtR.
:+1: perfect!
I have several other 'modes' done, I just have to make sure I have them right.
I have single, pure-mask, pure-rexgen and hybrid-rexgen 'done'. I only had to make one other dictionary (for single). In that dict, I kicked the iteration count up much higher for bitcoin. I did re-create the base dictionary file. The dictionary is now:
111110 ..... 111129 (20 numbers). And I use -rules=appendnumnum.
So it is easy to use pure mask, and pure rexgen, along with pure single will find them all (the prior dictionary had a ; and a space char in a couple words, and thus single was not as friendly).
Single mode is 2k hashes, with 177864 iterations (bitcoin). It is much slower, but I think the timing is about right. It might be a 'bit' slow, but that should not really hurt anything. This is just a testing tool. Once we get all the various tests in, and running fine, then we should not have to perform these checks very often. Probably would be GOOD to check before each real release.
I wouldn't be surprised if we found some more needs for fixes, minor or not. These new tests are very good Q.C.
Nice thing, is that they are somewhat static. We get them fixed, and there will infrequelty be new test types added, and the amount of code change really is not that often.
Does -stoponerror
work for this? When debugging I'm pretty sure I'll need to see exactly what was missed, and/or run the same tests manually.
each test will abort if the proper results are not found.
NOTE, for now, use -v The timings are static, and may not be right for specific CPU and or -fork and or OMP settings -v lets you see things as they chip away, so you can make sure the thing is still finding records. If the resume goes backwards TOO far, then it will get stuck, not being able to work through a single buffer/block in the time specified. The bitcoin is pretty minimal. It is OMP_SCALE=1 so it deals with 'few' candidates.
This is the 'outer' function:
sub doRestoreMode {
if ($core_only == 1) {
ScreenOut("John CORE build detected.\n The -max-run-time mode ONLY works for jumbo build of john.\n");
exit 1;
}
`rm -f tst-*`;
# grow the pw-new.dic file:
my $cmd = "$JOHN_EXE -rules=appendNumNum --stdout --w=bitcoin_restart_rules_tst.dic > tst-pw-new.dic 2>/dev/null";
$cmd = `$cmd`;
doOneRestore("-w=tst-pw-new.dic", "bitcoin_restart_rules_tst.in", 2000, 20, "bitcoin", "");
unlink("tst-pw-new.dic");
# now test wordlist + rules.
doOneRestore("-w=bitcoin_restart_rules_tst.dic", "bitcoin_restart_rules_tst.in", 2000, 20, "bitcoin", "-rules=appendNumNum");
# now test wordlist + mask.
doOneRestore("-w=bitcoin_restart_rules_tst.dic", "bitcoin_restart_rules_tst.in", 2000, 20, "bitcoin", "-mask=?w?d?d");
# now test single mode.
doOneRestore("", "bitcoin_restart_single_tst.in", 2000, 20, "bitcoin", "");
# now test pure mode.
doOneRestore("", "bitcoin_restart_rules_tst.in", 1000, 20, "bitcoin", "-mask=11111?d?d?d");
doOneRestore("", "bitcoin_restart_rules_tst.in", 1000, 20, "bitcoin", "-mask=11112?d?d?d");
# now test pure rexgen mode.
doOneRestore("", "bitcoin_restart_rules_tst.in", 2000, 20, "bitcoin", "-rexgen=1111[1-2][0-9][0-9][0-9]");
# now test wordlist + rexgen mode.
doOneRestore("-w=bitcoin_restart_rules_tst.dic", "bitcoin_restart_rules_tst.in", 2000, 20, "bitcoin", "-rexgen=/0[0-9][0-9]");
exit 0;
}
And here is the worker function that does each test.
sub doOneRestore {
my ($dic, $hashes, $cnt, $runtime, $form, $exargs) = @_;
ScreenOutSemi("\nRunning JTRTS in -restore mode (against a $cnt candidate $form input file $exargs)\n ** NOTE: may take a couple minutes to run\n");
my $cmd = "$JOHN_EXE -ses=tst- $pass_thru $exargs $dic $hashes -pot=tst-.pot -max-run=$runtime -form=$form 2>&1";
ScreenOutV("Running 1st retore command, command line\n\n$cmd\n\n");
my $results = `$cmd`;
my $ret = $?;
ScreenOutVV("Results of this run are: $results\n return code [".($ret>>8)."]\n\n");
if ($verbosity > 2) { show_eta($results); }
$cmd = "$JOHN_EXE -res=tst- 2>&1";
while ( ($ret>>8) == 1) {
ScreenOutV("Running continuing retore commands. Cmd line: $cmd\n");
$results = `$cmd`;
$ret = $?;
`stty echo >/dev/null 2>/dev/null`;
ScreenOutVV("Results of this run are: $results\n return code [".($ret>>8)."]\n\n");
if ($verbosity > 2) { show_eta($results); }
}
# now compute if we got them all.
$results = `LC_ALL='C' sort tst-.pot | LC_ALL='C' uniq | LC_ALL='C' wc -l`;
chomp $results;
print ("Done with run. Results (should be $cnt) : $results ");
cleanup();
if ($results != $cnt) {
print "FAIL!\n";
exit 1;
}
if ($error_cnt+$error_cnt_pot+$ret_val_non_zero_cnt != 0) {
print "Some error status.\n";
exit 1;
}
print "PASS\n";
`rm -f tst-*`;
}
s/retore/restore/g
Damn nit picker ;) That was simply an easter-egg for -v mode, lol
wordlist and wordlist+rules passed just fine here. And hybrid mask failed. Will look into it. The fact we can separate the basewords from the digits should mean I can see pretty much in what way it fails.
Oh, but it deleted the pot file...
I thopught that happened AFTER the exit. I will look.
ahh, there is a call to 'cleanup' in there.
I'll run it manually with a -max-run that does it all in just one resume. Should be pretty clear then.
-single mode seems to work. -pure mask seems to work. (only first 1000 checked so far, however).
I will have to let you get rexgen working. I no longer spend time to get that boat anchor code working. I will check in what I have. I think it is ready to roll.
Please see if rexgen is setup right. NOTE, you will have to comment out the mask test, to be able to continue the testing (or we could move that test to the end, knowing it fails).
bd0ac12
b7a79ed puts mask+wordlist to the last test.
Hmm the "single" test is actually running batch mode. We'll see how it goes. I assume it should explicitly run -single
, right?
Ah, but -single is the first mode of batch mode. This will work fine except if it actually does miss anything. If it does, it will continue with wordlist and then incremental, and "never" end, lol.
Or are you limiting the number of resumes? Doesn't look so in the code above.
No, there is no 'limit'. It runs until the return says jtr is done.
So in this case, the -single SHOULD be added. BUT if there are any missed, then it still is a problem, and will run a LONG time.
Fortunately, single 'seems' to resume properly.
Luckily single/batch did pass.
Not sure why you do two pure mask runs. It could be one if using -mask=1111[12]?d?d?d
And oops the rexgen test went into an infinite loop! I need to quit now, will look at this tomorrow.
Made some fixes. Two main problems: It's --regex and not --rexgen. Also, pure rexgen actually doesn't support resuming (it does warn about it) so that is commented out from TS now. But wordlist+rexgen should work (only saving parent mode's state). It doesn't work very good though - I think the problems is it doesn't have enough time to get ahead so it's stuck at the initial 10%. If that's it we could simply bump --max-run-time for that test to 20s but I'm ignoring this problem for now and will try to fix hybrid mask.
We're not looking for "capability rexgen" are we? We need to add that so we don't try to run regex mode when there is none.
You' need a special mode to test --regex, so that just a few combinations per candidate will be generated. Otherwise, you'll never proceed to the next block of input words. Plus, keep the run time slow. (For very fast hashes, --max-run-time just 1 or 2, otherwise, the memleak issues will cause trouble.
Idea: run a test with --max-run-time=1, restore repeatedly until finished, then check if all has been cracked as expected. I think we need just 2 tests (may be one saltless plus one fast salted format).
Or may be some more, for standalone
--mask
and for--wordlist [--rules] --mask
.(The
--fork
can be added as -passthru.)