Closed stevenjohnstone closed 5 years ago
(I've merged separately your fix in matching/dictionnary.go)
Thanks for finding this explanation !
It will be much easier to only focus on making the various matches work safely on valid utf8 strings since passwords are expected to be valid strings, therefore I just made a change to avoid evaluation and report a score of 0 for passwords containing invalid utf8 runes. From a quick run of the fuzzer there are still crashes but at least those will be easier to understand / fix.
2018-03-30 23:12 GMT+02:00 Steven Johnstone notifications@github.com:
@stevenjohnstone commented on this pull request.
In matching/leet.go https://github.com/trustelem/zxcvbn/pull/3#discussion_r178381819:
@@ -52,15 +54,17 @@ func (lm l33tMatch) Matches(password string) []*match.Match { }
func translate(password string, sub map[string]string) string {
- var res string
- for _, s := range password {
- if v, ok := sub[string(s)]; ok {
- res = res + v
- } else {
- res = res + string(s)
- return strings.Map(func(r rune) rune {
A better explanation: the problems occur when you have a string which is not valid utf8. As you range over the string, the rune value returned may be 0xfffd which is encoded in 3 bytes.
See https://golang.org/ref/spec#For_statements:
For a string value, the "range" clause iterates over the Unicode code points in the string starting at byte index 0. On successive iterations, the index value will be the index of the first byte of successive UTF-8-encoded code points in the string, and the second value, of type rune, will be the value of the corresponding code point. If the iteration encounters an invalid UTF-8 sequence, the second value will be 0xFFFD, the Unicode replacement character, and the next iteration will advance a single byte in the string.
Therefore, for each invalid byte value which isn't valid utf8, s will be 0xfffd and len(string(s)) is 3.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/trustelem/zxcvbn/pull/3#discussion_r178381819, or mute the thread https://github.com/notifications/unsubscribe-auth/AAw3eAulwZn2RPbqabn4G1Zk9alyrEzuks5tjp_DgaJpZM4S_TqY .
I'm closing this pull request now since some parts were merged and restricting passwords to valid utf8 inputs looks to have fixed most (all ?) other cases. Feel free to reopen if you find more bugs !
Pull Request Test Coverage Report for Build 14
💛 - Coveralls