lakiw / pcfg_cracker

Probabilistic Context Free Grammar (PCFG) password guess generator

Actually implement probability smoothing #21

Open lakiw opened 3 years ago

lakiw commented 3 years ago

There are many placeholders in the trainer where probability smoothing could be applied, but that functionality is not currently used.

For example, from the comments in calculate_probabilities.py:

This is a way to smooth probabilities between different items to help
the actual password cracker making use of them.

For example, if 'chair' was seen 100000 times, and 'table' was seen 100001
times, it would be nice to treat them as the same probability to reduce
the amount of work the pcfg cracker needs to perform.
Currently this is a no-op / placeholder and does not actually do anything.
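
To make the idea concrete, here is a rough sketch of one way such smoothing might work. It is not the code in calculate_probabilities.py: the function name `smooth_probabilities` and the `tolerance` parameter are hypothetical, and the bucketing rule (group items whose counts fall within a relative tolerance of the highest count in the bucket, then give them all the bucket's average) is just one assumption about what "treat them as the same probability" could mean.

```python
def smooth_probabilities(counts, tolerance=0.01):
    """Hypothetical sketch: map {item: count} to {item: probability}.

    Items whose counts are within `tolerance` (relative) of the highest
    count in their bucket share one probability. Using the bucket's
    average count keeps the probabilities summing to 1.
    """
    total = sum(counts.values())
    if total == 0:
        return {}

    # Highest counts first; ties are broken by item name for determinism.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

    smoothed = {}

    def flush(bucket):
        # Assign every item in the bucket the same averaged probability.
        average = sum(count for _, count in bucket) / len(bucket)
        for item, _ in bucket:
            smoothed[item] = average / total

    bucket = []
    bucket_top = None
    for item, count in ordered:
        # Close the bucket once a count falls outside the tolerance band.
        if bucket and count < bucket_top * (1 - tolerance):
            flush(bucket)
            bucket = []
        if not bucket:
            bucket_top = count
        bucket.append((item, count))
    if bucket:
        flush(bucket)

    return smoothed
```

With the example above, `smooth_probabilities({'chair': 100000, 'table': 100001, 'lamp': 50})` would give 'chair' and 'table' the same probability while 'lamp' stays much lower, so the cracker would have fewer distinct probability levels to handle.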

Opening this issue to remind myself to tackle this optimization at some point in the future.