blynn / nex

Lexer for Go
http://cs.stanford.edu/~blynn/nex/
GNU General Public License v3.0

Difficulty with matching a "string" #44

Closed by purpleidea 7 years ago

purpleidea commented 7 years ago

I'm writing a little programming language and trying to lex a "string" literal:

Eg:

$foo = "hello, world"

This is kind of tricky, in particular because of this note in the docs:

**Matching Nuances**

Among rules in the same scope, the longest matching pattern takes precedence. In event of a tie, the first pattern wins.

This means that if I have other similar patterns in the code, this rule will match when I don't expect it to. Additionally, is there some way to specify a string? I don't want to write out every Unicode char, and if you do a .* match it matches too much!!
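To illustrate the "matches too much" problem, here is a minimal sketch using Go's standard regexp package rather than nex itself. It assumes the wildcard attempt looked something like `".*"`; the helper name greedyMatch is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// naiveString is an assumed version of the wildcard attempt:
// a quote, anything at all, then a quote.
var naiveString = regexp.MustCompile(`".*"`)

// greedyMatch (hypothetical helper) returns the first match of the
// naive pattern in src.
func greedyMatch(src string) string {
	return naiveString.FindString(src)
}

func main() {
	// Two separate string literals on one line get swallowed as one match,
	// because .* happily runs past the first closing quote.
	fmt.Println(greedyMatch(`$a = "x" + "y"`)) // prints "x" + "y"
}
```

A longest-match lexer would likely show the same effect, since the longer span is also a valid match for the pattern.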

purpleidea commented 7 years ago

FYI at the moment I've got an incomplete:

/"[\a\b\t\n\v\f\r !#$%&'()*+,-.\/0-9:;<=>?@A-Z\[\\\]^_a-z{|}~]*"/ {
	// TODO: we need to match more chars for strings!
	s := yylex.Text()
	lval.str = s[1:len(s)-1] // remove the two quotes
	return STRING
}

purpleidea commented 7 years ago

Solved:

/"(\\.|[^"])*"/

The matched text then goes through strconv.Unquote, which strips the surrounding quotes and decodes the escape sequences.
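For anyone landing here later, a rough sketch of how the pattern and strconv.Unquote fit together. It uses Go's standard regexp package to stand in for the generated lexer, so it is not the actual nex action code; lexString is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// stringLit mirrors the pattern above: an opening quote, any run of
// backslash-escaped characters or non-quote characters, a closing quote.
var stringLit = regexp.MustCompile(`"(\\.|[^"])*"`)

// lexString (hypothetical helper) finds the first string literal in src
// and returns its decoded value.
func lexString(src string) (string, error) {
	raw := stringLit.FindString(src)
	if raw == "" {
		return "", fmt.Errorf("no string literal found in %q", src)
	}
	// Unquote strips the surrounding quotes and interprets escapes such
	// as \" and \n. It returns an error for input that is not a valid
	// Go string literal.
	return strconv.Unquote(raw)
}

func main() {
	for _, src := range []string{
		`$foo = "hello, world"`,
		`$bar = "say \"hi\"\n"`,
	} {
		s, err := lexString(src)
		fmt.Printf("%q %v\n", s, err)
	}
}
```

One caveat: Go's regexp package prefers the leftmost-first match by default, while nex picks the longest match, so edge cases with a backslash right before a quote may behave differently in the real lexer. Since strconv.Unquote follows Go's literal syntax, a raw newline inside the quotes is reported as an error rather than accepted.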

Full code to appear in https://github.com/purpleidea/mgmt/ as soon as I'm finished polishing the lang stuff.

HTH