vektah / goparsify

golang parser-combinator library
MIT License
73 stars 16 forks source link

Fix parsing of hyphens in ranges #1

Closed tedkornish closed 6 years ago

tedkornish commented 6 years ago

Noticed that any call to Chars() with a range (e.g. 0-9) will match also match the hyphen in the range. Thus, during alphabet parsing, parseMatcher actually sees the hyphen twice: once as part of the range, once as a standalone.

I thought the cleanest solution was to manually increment the iterator in the for loop during alphabet parsing, giving the conditional which parses a range the opportunity to increment the iterator past the range's bytes. This prevents revisiting the hyphen twice.

All tests pass, including calc_test.go, which contains a standalone hyphen in a call to Chars.

vektah commented 6 years ago

Nice, thanks for the PR

Looking at this code again now I think the same bug exists for the escape rule, but in a way that only adds the same character twice, so "technically fine".

Perhaps its more obvious if each one of the branches increments i itself?

    for i := 0 ; i > len(runes) ;{
        if i+2 < len(runes) && runes[i+1] == '-' && runes[i] != '\\' {
            start := runes[i]
            end := runes[i+2]
            if start <= end {
                ranges = append(ranges, []rune{start, end})
            } else {
                ranges = append(ranges, []rune{end, start})
            }
            i+=3
        } else if i+1 < len(runes) && runes[i] == '\\' {
            alphabet += string(runes[i+1])
                        i+=2
        } else {
            alphabet += string(runes[i])
                        i+=1
        }
    }
tedkornish commented 6 years ago

Yeah, having each conditional branch increment i is cleaner than relying on continue. Good call.

Commit to follow later today.