disco-lang / disco

Functional teaching language for use in a discrete mathematics course

Separate parser into lexer + parser #101

Open byorgey opened 6 years ago

byorgey commented 6 years ago

Currently, our parser operates directly on a stream of characters, i.e. tokenizing is built into the parsing. The idea would be to define our own token type, write a lexer that transforms the character stream into a token stream, and then modify the parser to consume that token stream. This should give us better error messages (and possibly a speedup?). Jasper van der Jeugt suggests this in his talk: https://skillsmatter.com/skillscasts/9879-an-informal-guide-to-better-compiler-errors-jasper-van-der-jeugt
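To make the proposed split concrete, here is a minimal sketch of the two stages for a toy arithmetic language. It is not Disco's grammar and it deliberately uses only `base` rather than our actual parsing library; the names `Token`, `tokenize`, `Expr`, and `parseExpr` are all hypothetical and chosen just for illustration.

```haskell
import Data.Char (isAlpha, isAlphaNum, isDigit, isSpace)

-- Hypothetical token type; a real one for Disco would need many more cases.
data Token
  = TNat Integer
  | TIdent String
  | TSym Char -- one of + * ( )
  deriving (Eq, Show)

-- Stage 1: characters to tokens. Whitespace is dropped here, so the
-- parser never has to deal with it.
tokenize :: String -> Either String [Token]
tokenize [] = Right []
tokenize s@(c : cs)
  | isSpace c = tokenize cs
  | isDigit c = let (ds, rest) = span isDigit s
                in (TNat (read ds) :) <$> tokenize rest
  | isAlpha c = let (name, rest) = span isAlphaNum s
                in (TIdent name :) <$> tokenize rest
  | c `elem` "+*()" = (TSym c :) <$> tokenize cs
  | otherwise = Left ("unexpected character " ++ show c)

data Expr = Nat Integer | Var String | Add Expr Expr | Mul Expr Expr
  deriving (Eq, Show)

-- Stage 2: tokens to syntax tree, by recursive descent. Each parser
-- returns the parsed value together with the unconsumed tokens.
type Parser a = [Token] -> Either String (a, [Token])

-- expr ::= term ('+' term)*
expr :: Parser Expr
expr ts = do
  (t, rest) <- term ts
  go t rest
  where
    go acc (TSym '+' : rest) = do
      (t, rest') <- term rest
      go (Add acc t) rest'
    go acc rest = Right (acc, rest)

-- term ::= atom ('*' atom)*
term :: Parser Expr
term ts = do
  (a, rest) <- atom ts
  go a rest
  where
    go acc (TSym '*' : rest) = do
      (a, rest') <- atom rest
      go (Mul acc a) rest'
    go acc rest = Right (acc, rest)

-- atom ::= number | identifier | '(' expr ')'
atom :: Parser Expr
atom (TNat n : rest) = Right (Nat n, rest)
atom (TIdent x : rest) = Right (Var x, rest)
atom (TSym '(' : rest) = do
  (e, rest') <- expr rest
  case rest' of
    TSym ')' : rest'' -> Right (e, rest'')
    _ -> Left "expected ')'"
atom (t : _) = Left ("unexpected token " ++ show t)
atom [] = Left "unexpected end of input"

-- Run both stages and require that every token is consumed.
parseExpr :: String -> Either String Expr
parseExpr src = do
  ts <- tokenize src
  (e, rest) <- expr ts
  case rest of
    [] -> Right e
    t : _ -> Left ("unexpected token " ++ show t)
```

With this shape, `parseExpr "2 * (x + 1)"` goes through `tokenize` first, and any parse error is phrased in terms of whole tokens rather than individual characters. A real version would also want each token to carry its source position so errors can point back into the input; that is left out here to keep the sketch short.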

byorgey commented 2 years ago

Here's a nice example of separating out lexing and parsing phases using megaparsec: https://gist.github.com/LightAndLight/36926a4e3a7133910d9aa199da50c4fd