RUB-SysSec / OMEN

OMEN: Ordered Markov ENumerator - Password Guesser
https://www.mobsec.rub.de

Issue with Length data when training #10

Open lakiw opened 6 years ago

lakiw commented 6 years ago

It appears that the length data saved in LN.level has an off-by-one error when training: passwords of length 4 are saved as length 5, passwords of length 5 are saved as length 6, and so on, with junk data saved for length 4 (using ngrams = 4).

For example, consider the following training set. Note that it does not produce "junk" data for length 4, but I've seen that appear with larger training sets like the RockYou list:

test test1 test1 test12 test12 test12 test123 test123 test123 test123

So there is 1 password of length 4, 2 of length 5, etc. I used the following command for training:

./createNG -F -v -n 4 --iPwdList test.txt

The following is my LN.count file:

...
0 1
0 2
0 3
0 4
1 5
2 6
3 7
4 8
0 9
0 10
0 11
....
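For comparison, the expected length histogram for this training set can be computed independently of OMEN. This is only a minimal sketch, not OMEN code: `length_histogram` is a hypothetical helper, and printing the count followed by the length is just an assumption made to mirror the listing above.

```python
from collections import Counter


def length_histogram(passwords):
    """Count how many passwords there are of each length."""
    return Counter(len(pw) for pw in passwords)


# The training set from this issue: 1 x "test", 2 x "test1", 3 x "test12", 4 x "test123".
training_set = ["test"] + ["test1"] * 2 + ["test12"] * 3 + ["test123"] * 4
expected = length_histogram(training_set)

# Print "count length" pairs for lengths 1 through 11.
for length in range(1, 12):
    print(expected.get(length, 0), length)
```

With this training set the non-zero counts fall on lengths 4 through 7, whereas the LN.count listing above shows them on rows 5 through 8.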

lakiw commented 6 years ago

*Smacks head.* Looks like the value in row 4 is for length 3 passwords. For example, if I modify the training set to:

tes test test1 test1 test12 test12 test12 test123 test123 test123 test123

I get the following in LN.count:

0 1
0 2
0 3
1 4
1 5
2 6
3 7
4 8
0 9
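Both listings look consistent with a plain shift of one, where row n holds the count of passwords of length n - 1. A quick sketch to check that reading against the numbers in this thread follows. It assumes each LN.count pair is "count, row number"; `shifted_by_one` is a hypothetical helper, not part of OMEN.

```python
from collections import Counter


def shifted_by_one(passwords, reported):
    """Return True if every reported row n holds the count of passwords of length n - 1."""
    actual = Counter(len(pw) for pw in passwords)
    return all(count == actual[row - 1] for row, count in reported.items())


first_set = ["test"] + ["test1"] * 2 + ["test12"] * 3 + ["test123"] * 4
second_set = ["tes"] + first_set

# Row number -> count, copied from the two LN.count listings in this issue.
first_reported = {1: 0, 2: 0, 3: 0, 4: 0, 5: 1, 6: 2, 7: 3, 8: 4, 9: 0, 10: 0, 11: 0}
second_reported = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 2, 7: 3, 8: 4, 9: 0}

print(shifted_by_one(first_set, first_reported))
print(shifted_by_one(second_set, second_reported))
```

If both checks hold, row 4 is not junk but the count for length 3. This does not say whether the shift is a bug or an intended indexing convention.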