Closed: frank-dittrich closed this issue 9 years ago
When I change Save = 60 to Save = 62 in john.conf, the error occurs much later:
606 0 1499
621 0 1499
626 0 1499
675 0 1499
688 0 1499
690 0 1499
692 0 1499
698 0 1499
706 0 1499
707 0 1499
708 0 1497
716 0 1499
718 0 1499
723 0 1499
726
Still no indication of an error in *.log, stderr-*.txt, or stdout-*.txt.
Can you try schedtool -a 0x1 -e ./john ... instead of ./john?
I do not have anything named schedtool.
I can't see why the Save timer would affect this. No session runs for that long anyway, right?
> I can't see why the Save timer would affect this
OK, it's because OS_TIMER counts backwards.
With Save = 2, I get:
$ (for i in `seq 2 1024`; do echo -n -e "$i\t"; echo $i > i.txt; /bin/rm john.pot; ./john --wordlist=../test/pw.dic ../test/LM_tst.in --format=lm --fork=${i} --session=t-$i > stdout-$i.txt 2> stderr-$i.txt ; echo -n -e "$?\t"; LC_ALL=C sort -u john.pot | wc -l; done | LC_ALL=C grep -v 1500 ); echo $?
268 0 1499
282 0 1499
293 0 1499
304 0 1499
318 0 1499
319 0 1499
325 0 1499
332 0 1499
333 0 1499
335 0 1497
340 0 1497
...
957 0 1499
958 0 1499
959 0 1499
960 0 1498
961 0 1496
bash: fork: retry: Resource temporarily unavailable
962 1 79
963 1 23
964 1 2
965 1 97
966 1 2
bash: fork: retry: Resource temporarily unavailable
967 1 0
968 1 0
969 1 41
970 1 7
971 1 0
972 1 0
973 1 0
974 1 4
975 1 0
976 1 5
977 1 29
978 1 0
979 1 0
bash: fork: retry: Resource temporarily unavailable
980 1 0
981 1 53
bash: fork: retry: Resource temporarily unavailable
982 1 12
983 1 25
984 1 31
985 1 0
986 1 0
987 1 0
988 1 0
989 1 7
bash: fork: retry: Resource temporarily unavailable
990 1 2
991 1 0
992 1 2
993 1 32
994 1 0
995 1 41
996 1 5
997 1 0
998 1 9
999 1 25
1000 1 15
1001 1 5
1002 1 33
1003 1 0
bash: fork: retry: Resource temporarily unavailable
1004 1 3
1005 1 13
1006 1 2
1007 1 5
1008 1 0
1009 1 2
1010 1 0
1011 1 28
1012 1 44
1013 1 18
1014 1 4
bash: fork: retry: Resource temporarily unavailable
1015 1 2
1016 1 10
1017 1 7
1018 1 2
1019 1 0
1020 1 25
1021 1 9
1022 1 19
1023 1 9
1024 1 1
0
$ grep -i error t-96[12].log
t-962.log:1 0:00:00:00 Terminating on error, john.c:489
On 11/09/2014 11:38 PM, magnum wrote:
> Trying the same on OSX, it works fine up to 572, and from that point the problem can be seen in the stderr file: "fork: Resource temporarily unavailable". So I can't reproduce any problem with John.
I bet that with fewer than 572 forks, john finishes in less than 1 second on your OSX system, so you don't get the signals that are sent when 59 of the 60 seconds are left.
On my 64-bit Linux system with an Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz (quad core, no hyperthreading), I manage to get john killed by SIGUSR2 at 761 forked processes (with Idle = Y in john.conf), when I run 4 other john processes (with Idle = N in john.conf) trying (not) to crack the rar test hashes: ./john --incremental --min-length=6 hashes.rar --session=rar4
...
30670 fd 20 0 542700 48620 9224 R 92.8 0.6 18:16.65 ./john --session=rar1 --incremental --min-length=6 hashes.rar
27686 fd 20 0 542704 45892 9188 R 90.8 0.6 14:33.13 ./john --incremental --min-length=6 hashes.rar --session=rar5
20452 fd 20 0 542700 48624 9216 R 85.8 0.6 20:27.23 ./john --incremental --min-length=6 hashes.rar --session=rar3
20484 fd 20 0 542700 48592 9188 R 85.8 0.6 20:00.28 ./john --incremental --min-length=6 hashes.rar --session=rar4
10178 fd 20 0 93968 10644 6844 R 4.0 0.1 0:00.12 ./john --wordlist=../test/pw.dic ../test/LM_tst.in --format=lm --fork=510 --session=t-510
And some tests with a smaller number of forked processes exited with $? = 0, but cracked fewer than 1500 unique hashes.
$ (for i in `seq 500 1024`; do echo -n -e "$i\t"; echo $i > i.txt; /bin/rm john.pot; ./john --wordlist=../test/pw.dic ../test/LM_tst.in --format=lm --fork=${i} --session=t-$i > stdout-$i.txt 2> stderr-$i.txt ; echo -n -e "$?\t"; LC_ALL=C sort -u john.pot | wc -l; done | LC_ALL=C grep -v 1500 ); echo $?
631 0 1499
726 0 1499
737 0 1499
743 0 1499
755 0 1498
761
0
As root, with ulimit -u = 7794 (compared to 1024 as a regular user), nothing changes.
# (for i in `seq 2 1024`; do echo -n -e "$i\t"; echo $i > i.txt; /bin/rm john.pot; ./john --wordlist=../test/pw.dic ../test/LM_tst.in --format=lm --fork=${i} --session=t-$i > stdout-$i.txt 2> stderr-$i.txt ; echo -n -e "$?\t"; LC_ALL=C sort -u john.pot | wc -l; done | LC_ALL=C grep -v 1500 ); echo $?
/bin/rm: cannot remove ‘john.pot’: No such file or directory
350 0 1499
355 0 1499
356 0 1499
358 0 1499
373 0 1499
374 0 1499
380 0 1497
389 0 1498
391 0 1499
394
0
# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 7794
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 7794
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
With some more changes to ulimits:
# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 15588
max locked memory (kbytes, -l) 256
max memory size (kbytes, -m) unlimited
open files (-n) 4096
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 1638400
real-time priority (-r) 0
stack size (kbytes, -s) 16384
cpu time (seconds, -t) unlimited
max user processes (-u) 7794
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
I get
# (for i in `seq 2 1024`; do echo -n -e "$i\t"; echo $i > i.txt; /bin/rm john.pot; ./john --wordlist=../test/pw.dic ../test/LM_tst.in --format=lm --fork=${i} --session=t-$i > stdout-$i.txt 2> stderr-$i.txt ; echo -n -e "$?\t"; LC_ALL=C sort -u john.pot | wc -l; done | LC_ALL=C grep -v 1500 ); echo $?
/bin/rm: cannot remove ‘john.pot’: No such file or directory
202 0 1499
207 0 1499
238 0 1498
266 0 1499
268 0 1499
271
0
So, this is most likely not related to ulimit.
I bet https://github.com/magnumripper/JohnTheRipper/issues/798#issuecomment-62325144 would make it much better, (at least provided you don't use a Save interval that is a multiple of 3).
> I bet #798 (comment) would make it much better, (at least provided you don't use a Save interval that is a multiple of 3).
I tested with the default Save interval of 60, which is a multiple of 3. Why do you think that matters here? All these test runs are much faster than 60 seconds, even on my old 32bit system.
(for i in `seq 300 1024`; do echo -n -e "$i\t"; echo $i > i.txt; /bin/rm john.pot; ./john --wordlist=../test/pw.dic ../test/LM_tst.in --format=lm --fork=${i} --session=t-$i > stdout-$i.txt 2> stderr-$i.txt ; echo -n -e "$?\t"; LC_ALL=C sort -u john.pot | wc -l; done |grep -v 1500 ); echo $?
rm: cannot remove ‘t-*.rec’: No such file or directory
603 0 1499
608 0 1499
663 0 1499
675 0 1499
696 0 1499
700 0 1497
711 0 1498
714 0 1498
717
0
No indication of problems in log file, stderr or stdout output.
So, the improvement is similar to what I got when I used Save = 60 without that patch.
BTW, the --fork=717 test that got killed produced a john.pot file with 1500 unique hashes.
> I tested with the default Save interval of 60, which is a multiple of 3. Why do you think that matters here? All these test runs are much faster than 60 seconds, even on my old 32bit system.
Because, like you found out, without that patch and with OS_TIMER, some things would happen after 0 seconds instead of after three seconds. Maybe that is unrelated.
> Because, like you found out, without that patch and with OS_TIMER, some things would happen after 0 seconds instead of after three seconds. Maybe that is unrelated.
Yes, but that was without the patch.
With Save = 60, after one second, you'd get the SIGUSR2 signals, because 59 & 3 == 3.
With your patch, you'd get the SIGUSR2 signals 2 seconds later, no matter whether you have Save = 60 or Save = 62: (60 - 57) & 3 == 3 and (62 - 59) & 3 == 3.
I had the vague idea that under this crazy over-booking of resources there could be a difference. But now that I think about it, we should already be "protected" against USR2 anyway: the real key is calling sig_init() and sig_init_child() early enough, and this doesn't change that. I can't think of any way to do those earlier than we do now, IIRC.
> I can't think of any way to do those earlier than we do now iirc.
On the other hand, if we indeed introduce yet another counter instead of using ((timer_save_interval - timer_save_value) & 3) == 3, we could also opt to never ever send a USR2 during the first minute, or whatever we choose.
However we put it, I think this issue and #798 are purely academic. If we can find solutions for them, fine. If we can't, I will not lose any sleep over it.
I'm closing this. If some similar problem can still be triggered, please open a brand new issue. This issue is too clobbered anyway.
This is with latest jtrts commit (19ca304106ebebb5d0d9b717adc2cb4626cc9808) and latest bleeding-jumbo commit (445440621046cfb3f0d7efe3fb1e062385f6ffb4)