haskell / alex

A lexical analyser generator for Haskell
https://hackage.haskell.org/package/alex
BSD 3-Clause "New" or "Revised" License

token length incorrect using --latin1 option #63

Closed chrismshelton closed 9 years ago

chrismshelton commented 9 years ago

The template files assume UTF-8 in the alex_scan_tkn function, even when the --latin1 option is given.

In templates/GenericTemplate.hs:

alex_scan_tkn user orig_input (if c < 0x80 || c >= 0xC0 then PLUS(len,ILIT(1)) else len)

This leads to an incorrect length being passed to the lexer actions for tokens containing bytes in the range 0x80 to 0xBF, because the condition only increments the length for bytes below 0x80 or at or above 0xC0.
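To make the concern concrete, here is a minimal sketch of the counting rule in the quoted condition, applied to Latin-1 bytes. It is a standalone model, not Alex's template code: the names `startsChar`, `utf8StyleLength`, `latin1Length`, `cafe`, and `pound` are hypothetical, and the sketch says nothing about which code path Alex actually takes when `--latin1` is given.

```haskell
import Data.Word (Word8)

-- | Same test as the quoted condition: True for every byte that is not
-- a UTF-8 continuation byte (0x80..0xBF). Hypothetical helper name.
startsChar :: Word8 -> Bool
startsChar c = c < 0x80 || c >= 0xC0

-- | Length as the quoted logic would count it: bytes in 0x80..0xBF
-- do not increment the length.
utf8StyleLength :: [Word8] -> Int
utf8StyleLength = length . filter startsChar

-- | Length one would expect for Latin-1 input: one character per byte.
latin1Length :: [Word8] -> Int
latin1Length = length

-- "café" in Latin-1: 'é' is the single byte 0xE9, outside 0x80..0xBF.
cafe :: [Word8]
cafe = [0x63, 0x61, 0x66, 0xE9]

-- "a£b" in Latin-1: '£' is the single byte 0xA3, inside 0x80..0xBF.
pound :: [Word8]
pound = [0x61, 0xA3, 0x62]
```

Under this model the two counts agree for `cafe` (both 4) but differ for `pound` (2 versus 3), which is the kind of mismatch described above. Only Latin-1 characters whose byte falls in 0x80..0xBF would be affected.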

hvr commented 9 years ago

Do you happen to have a small test case for this?

chrismshelton commented 9 years ago

Sorry, my bad. Now that I'm trying to come up with a test case, I can't reproduce the error.