Some parts of the parser state machine are a bit hairy: there are a lot of copy-and-pasted snippets, and the places where we detect syntax errors are scattered all over the code.
Here's an experiment:
add a dead simple lexer based on shlex
split the input line into tokens and use pattern matching to separate valid from invalid syntax
side effect: for errors, we'll now also get the position in the line (for better error messages)
side effect: we can more easily detect invalid syntax