haskell-suite / haskell-src-exts

Manipulating Haskell source: abstract syntax, lexer, parser, and pretty-printer

fix unicode identifier parsing #442

Closed obfusk closed 4 years ago

obfusk commented 4 years ago

Correctly parse identifiers that start with a letter that is neither upper nor lower case (such a letter should count as lower case); e.g. .

I'm guessing this needs to be fixed in more places; I'd be happy to look into that sometime later, but I'm not very familiar with the codebase and would appreciate some help. I just found this specific bug when fixing a bug in hoogle.

I did not add any extra tests either; I can do that later if you'd like.

obfusk commented 4 years ago

I suspect several other uses of isLower c should be isLetter c && not (isUpper c).
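To illustrate the difference between the two predicates, here is a minimal sketch. It is not the code from this PR or from the later commit; the helper name `isLowerLike` and the sample characters are hypothetical and chosen only for illustration, while `isLetter`, `isLower`, and `isUpper` are the standard functions from `Data.Char`.

```haskell
import Data.Char (isLetter, isLower, isUpper)

-- Hypothetical helper (not part of haskell-src-exts): treats every letter
-- that is not upper case as lower case, so caseless letters also qualify.
isLowerLike :: Char -> Bool
isLowerLike c = isLetter c && not (isUpper c)

main :: IO ()
main = mapM_ report samples
  where
    -- Illustrative samples: 'a', 'A', U+3042 (a Hiragana letter, which
    -- has no case), and the digit '1'.
    samples = ['a', 'A', '\x3042', '1']
    report c =
      putStrLn (show c ++ ": isLower = " ++ show (isLower c)
                ++ ", isLowerLike = " ++ show (isLowerLike c))
```

For a caseless letter such as U+3042, `isLower` is `False` while `isLowerLike` is `True`, which is the situation described above; for `'a'`, `'A'`, and `'1'` the two predicates agree.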

DanBurton commented 4 years ago

Instead of merging this, I've added a commit to handle it more thoroughly: https://github.com/haskell-suite/haskell-src-exts/commit/c9f786d70d5a5c78f66784b681bf7a2c5d1c1fa0

And a test: https://github.com/haskell-suite/haskell-src-exts/commit/94a3bcde453910f8133ad1a01e9e25c77ec64372

There's still something weird about pretty-printing these kinds of identifiers, though. See also: https://github.com/haskell-suite/haskell-src-exts/issues/443

obfusk commented 4 years ago

Great! Thanks for saving me the extra work to handle it more thoroughly :)