nbutton23 / zxcvbn-go

zxcvbn password complexity algorithm in golang
MIT License
379 stars 49 forks source link

changing l33t substitution set of possibilities #28

Closed edwincarlo closed 6 years ago

edwincarlo commented 6 years ago

Hello @nbutton23 ,

There is a small difference on zxcvbn-go and python-zxcvbn related to the recursiveness issue. This difference is probably about the understanding of what "all" means when generating the list of permutations (returned by getAllPermutationsOfLeetSubstitutions function).

On zxcvbn presentation and also on paper, the author exposed the example like in the following excerpt:

"[...] Additionally, it attempts each possible l33t substitution according to a table. An input @BA1one is first lowercased to @ba1one. If the 133t table maps @ to a and 1 to either i or l, it tries two additional matches by subbing [@->a, 1->i] and [@->a, 1->l], finding abalone with the second substitution. [...]"

But there was no details or reference to the procedure adopted to create "each possible l33t substitution".

The point is that while zxcvbn-go is really doing a complete list of every possible permutation, python-zxcvbn is only considering one human readable character to replace one l33t character on a given word per time. As an example, the value "1|1|" in golang version provides 2^n possibilities of substitution (where n is the number of chars in the word following this pattern) while the python version provides only 6 possible substitutions for a word that contains only "1" and "|" independent of the word length ("ilil", "lili", "i|i|", "l|l|", "1i1i", and "1l1l"). Even that the approach addopted by golang version includes a more complete list, it leads to the problem we are facing with recursiveness according to user input.

It does not means that python version of zxcvbn does not have problems to deal with large inputs. Python and also javascript versions of zxcvbn get in trouble while dealing with large inputs like "4@8({[<3691!|1|70$5+7%24@8({[<364@8({[<3691!|1|70$5+7%24@8({[<364@8({[<3691!|1|70$5+7%24@8({[<36".

To solve these problems, we have made some changes on code. A list of substitutions map is created for each possibility after relevant map creation, similar to python version. The difference is that it is possible to create the substitution table without look to user input because we already have the relevant map, making the algorithm execution as complex as the map stored on data. These changes made the zxcvbn-go a little more robust than python version in this specific point, while it is still providing the same substitution list.

The solution was a little bit more complex than the one used on python, but it was planned to has no recursiveness at all. Unit tests were created to help on understanding and also to check the scenarios. These tests also included some scenarios provided on related issues (making sure that the problem was solved for these inputs). 100% of the changes are unit tested as it is possible to see in code coverage. Code was also documented to explain the steps being performed.

It is still not considering extra stop criteria while the possibilities of permutations are increasing. But the consumption of computational resources would only change if the L33t.json stored on data is changed. This makes it safe to variations of user input but not for mistakes on development side. Otherwise, it is possible to rely a little bit on unit tests, that are considering some adverse scenarios.

It would be great to have it included on main code.

Please consider review the changes and give us feedback if some adjustment is necessary.