yymmsong opened this issue 4 years ago
Hi! Thanks for the comment, it's great to see that this code is useful and used!
Would you be able to do a pull request for this?
I wonder if you could also check whether the same problem happens in LazyBackoff...
Hi, thanks for your reply. Your code is really helpful for newcomers to password security like me.
I just looked into the LazyBackoff code and noticed a similar (but not exactly the same) problem. In LazyBackoff, it seems that the probabilities of transitions out of the same state will not always add up to 1, because the parameter \alpha in https://en.wikipedia.org/wiki/Katz%27s_back-off_model is not set properly (its denominator is missing). I'll re-check whether this is the case and see if I can fix the problem.
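To make the normalization issue concrete, here is a minimal sketch of Katz's back-off weight with the denominator in place. All names (`katz_alpha`, `katz_prob`, the fixed absolute `discount`) are hypothetical and not taken from the repo; the point is only that dividing the left-over mass by the lower-order mass of the *unseen* symbols is what makes the distribution sum to 1.

```python
from collections import Counter

def katz_alpha(counts_h, lower_prob, discount=0.5):
    """Back-off weight alpha(h) for one history h, per Katz's definition.

    counts_h: Counter mapping next symbol -> count observed after history h
    lower_prob: dict mapping symbol -> lower-order probability (sums to 1)
    """
    total = sum(counts_h.values())
    # Probability mass reserved by discounting the observed transitions.
    left_over = 1.0 - sum((c - discount) / total for c in counts_h.values())
    # Katz divides by the lower-order mass of the unseen symbols; dropping
    # this denominator is exactly the bug described above, and without it
    # the per-history probabilities need not sum to 1.
    unseen_mass = sum(p for s, p in lower_prob.items() if s not in counts_h)
    return left_over / unseen_mass

def katz_prob(symbol, counts_h, lower_prob, discount=0.5):
    """Katz back-off probability of `symbol` after history h."""
    total = sum(counts_h.values())
    if symbol in counts_h:
        # Observed transition: discounted higher-order estimate.
        return (counts_h[symbol] - discount) / total
    # Unseen transition: back off, scaled by alpha(h).
    return katz_alpha(counts_h, lower_prob, discount) * lower_prob[symbol]
```

With, say, `counts_h = Counter({'a': 3, 'b': 1})` and a lower-order distribution over `{'a','b','c','d'}`, summing `katz_prob` over all four symbols gives exactly 1, which is what fails when the denominator is omitted.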
Thanks for checking, please let me know what you find out! Honestly, it's code I haven't touched for a long time so it's not trivial to remember what it does :)
BTW, I'm curious about what you're working on--if you can share that. And of course, please let me know if/when you have something that's publicly available!
I'm working on so-called "honey objects" like honeywords, etc., but recently, as a side project, I'm investigating how different n-gram smoothing algorithms affect the performance of password cracking models. I'll make my work public as soon as I've finished it.
Hi, I just modified BackoffModel and LazyBackoff to match Katz's and Ma's definitions. However, it's unclear whether the modification improves or degrades performance (it's quite possible that the implementation here is better than Ma's, since interpolation generally outperforms pure backoff), so it may be better to leave the code as it is until further evaluation. You can check or merge the change in my forked repo.
Also, results given by BackoffModel and LazyBackoff in the original implementation seem to differ, maybe because of the subtle differences in smoothing.
The backoff implementation here seems a bit inconsistent with Katz's and Ma et al.'s backoff models. Line 114 in backoff.py:
This statement adds some lower-order probability to every transition probability, but Katz's backoff model only backs off to lower-order grams when the current gram has not been observed after its history. Something like
would be more reasonable here, I guess.
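To illustrate the distinction being drawn here, below is a hedged sketch of the two smoothing styles side by side. The function names (`interpolated_prob`, `backoff_prob`) and the fixed absolute discount are hypothetical, not the repo's actual code: the first always mixes lower-order mass into every transition (the behavior being questioned), while the second falls back to the lower order only for transitions that were never observed, as Katz's model prescribes.

```python
from collections import Counter

def interpolated_prob(symbol, counts_h, lower_prob, discount=0.5):
    """Interpolated smoothing: every transition, seen or unseen, mixes in
    some lower-order probability mass (the style the issue describes)."""
    total = sum(counts_h.values())
    # Mass reserved by discounting each of the observed transitions.
    reserved = discount * len(counts_h) / total
    higher = max(counts_h.get(symbol, 0) - discount, 0) / total
    return higher + reserved * lower_prob[symbol]

def backoff_prob(symbol, counts_h, lower_prob, discount=0.5):
    """Katz-style pure backoff: use the lower order only when the
    transition was never observed after this history."""
    total = sum(counts_h.values())
    if symbol in counts_h:
        return (counts_h[symbol] - discount) / total
    # Redistribute the reserved mass over unseen symbols only, normalized
    # by their lower-order mass so the distribution still sums to 1.
    reserved = discount * len(counts_h) / total
    unseen_mass = sum(p for s, p in lower_prob.items() if s not in counts_h)
    return reserved * lower_prob[symbol] / unseen_mass
```

Both variants produce a proper distribution per history; they differ in *where* the lower-order information enters, which is presumably why the original and modified models give different results.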