lakiw / pcfg_cracker

Probabilistic Context Free Grammar (PCFG) password guess generator

--coverage option has opposite explanation(in cmd, in github) #24

Closed gtgtjune closed 2 years ago

gtgtjune commented 2 years ago

Hi, I think the --coverage option is explained in opposite ways in the command-line help and on GitHub.

// If you set coverage to 1, no brute force will be performed. If you set coverage to 0, it will only generate guesses using Markov attacks. //

    # Add in the probability of brute force to the base structures
    if program_info['coverage'] != 0:
        # Make sure there are valid OMEN parses, otherwise no sense creating
        # a brute force rule

lakiw commented 2 years ago

Hi, thank you very much! Pointing out confusing and incorrect parts of my documentation is really helpful, since the whole point is to explain the functionality to others. Beyond that, you highlighted a bug/shortcut from when I first added OMEN that A) I forgot about, and B) I really should fix! I'm playing around with some options right now, and once I have something working I'll push it to the "code_cleanup" branch I'm currently working on.

Ideally, I'd like the percentage of guesses devoted to OMEN to be (1 - coverage) * 100%, as the help text suggested. But as you pointed out, I had thrown the divide-by-zero check in there to keep it from crashing, and in that case it wouldn't generate OMEN guesses at all (which is a bug). Right now there is also a shortcut, since I didn't want to recalculate the base structure probabilities for everything else. So I need to be less lazy and actually implement it correctly.
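As a rough illustration of the behavior described above (not the code that ended up in the repository), the base structure probabilities can be rescaled so that the trained structures share `coverage` of the probability mass and a single fallback structure receives `1 - coverage`. The function name `apply_coverage`, the `"M"` key used for the Markov/OMEN fallback structure, and the example base structures are all assumptions made for this sketch.

```python
def apply_coverage(base_structures, coverage, fallback_key="M"):
    """Rescale base structure probabilities to honor a coverage value.

    base_structures: dict mapping a base structure string to its probability
                     (hypothetical layout, chosen for illustration).
    coverage:        float in [0.0, 1.0]; share of probability mass given to
                     structures learned from the training set.
    fallback_key:    hypothetical key for the Brute-Force/Markov/OMEN structure.
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0.0 and 1.0")

    total = sum(base_structures.values())
    if total <= 0.0:
        # No usable trained structures: everything goes to the fallback.
        coverage = 0.0

    rescaled = {}
    if coverage > 0.0:
        # Multiplying (rather than dividing) by coverage avoids any
        # divide-by-zero special case when coverage is 0.
        for structure, prob in base_structures.items():
            rescaled[structure] = coverage * prob / total

    if coverage < 1.0:
        # The fallback structure gets the remaining (1 - coverage) share.
        rescaled[fallback_key] = 1.0 - coverage

    return rescaled


if __name__ == "__main__":
    trained = {"A4D2": 0.75, "A6": 0.25}
    for c in (0.0, 0.6, 1.0):
        print(c, apply_coverage(trained, c))
```

With this formulation, a coverage of 0.0 yields only the fallback structure and a coverage of 1.0 omits it entirely, so neither endpoint needs a guard that silently skips OMEN.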

Thanks again!

lakiw commented 2 years ago

I just pushed a fix to the "code_cleanup" branch and made sure to credit you in the commit comments. I'm still testing the results, so I'm going to leave this issue open for a bit. Besides fixing the internal logic, I modified the help text to read:

--coverage COVERAGE, -c COVERAGE

The coverage you expect the training set to have when cracking passwords. What this really means is how many guesses should be generated from strings found in the training set, and how many guesses should be generated by Brute-Force/Markov/OMEN. A higher coverage means less guesses generated by fall back options like Markov. Roughly coverage translates to the percentage of guesses to generate using strings found in the training set, so a coverage of 1.0 means do not generate Brute-Force/Markov/OMEN guesses, and a coverage of 0.0 means ONLY generate Brute-Force/Markov/OMEN guesses. A coverage of 0.5 would mean splitting the guesses between them 50/50. Range: Between 1.0 and 0.0. Default: 0.6
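For readers who want to see how an option with this range and default could be declared, here is a small sketch using Python's standard `argparse` module. It is not taken from the project's source: the helper names `coverage_value` and `build_parser`, the parser description, and the shortened help string are assumptions for illustration. Only the flag names, the 0.0 to 1.0 range, and the 0.6 default come from the help text above.

```python
import argparse


def coverage_value(text):
    """Parse a coverage argument and enforce the 0.0 to 1.0 range."""
    try:
        value = float(text)
    except ValueError:
        raise argparse.ArgumentTypeError(f"{text!r} is not a number")
    # NaN also fails this comparison, so it is rejected here as well.
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError("coverage must be between 0.0 and 1.0")
    return value


def build_parser():
    """Build a hypothetical parser exposing only the coverage option."""
    parser = argparse.ArgumentParser(description="Coverage option sketch")
    parser.add_argument(
        "--coverage", "-c",
        type=coverage_value,
        default=0.6,
        help="Share of guesses generated from training-set strings; "
             "the rest come from Brute-Force/Markov/OMEN. "
             "Range: 0.0 to 1.0. Default: 0.6",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    trained_pct = args.coverage * 100
    print(f"training set: {trained_pct:.0f}%, fallback: {100 - trained_pct:.0f}%")
```

Validating the range at parse time means an out-of-range value is reported as a usage error before any training starts.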

gtgtjune commented 2 years ago

Thank you for the rapid reply. The change is clear.

lakiw commented 2 years ago

Finally released v4.3 with this change. Closing this issue. Thank you once again for raising it!