Error handling of the lexer

damlabeyaz commented 11 years ago

This issue is especially important for the lexer and parser group, but members of other groups can also join and give ideas.

The question came up whether the lexer should throw lexical errors when the current input is not a valid lexeme, i.e. when no token can be returned. There are several error handling strategies, and the dragon book prefers panic-mode recovery, where the lexer skips symbols until it finds a synchronization token (e.g. ";") or the start of a new valid lexeme.
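
To make the panic-mode idea concrete, here is a minimal Java sketch. It is only an illustration under assumptions: the class `PanicModeLexer`, the method `tokenize` and the tiny lexeme set (identifiers, numbers and ";") are hypothetical and not taken from our code base.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from the project's code base.
final class PanicModeLexer {

    /** Returns the identifier, number and ";" lexemes, silently skipping invalid runs. */
    static List<String> tokenize(String input) {
        List<String> lexemes = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == ';') {
                lexemes.add(";");
                i++;
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < input.length() && Character.isLetterOrDigit(input.charAt(i))) i++;
                lexemes.add(input.substring(start, i));
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < input.length() && Character.isDigit(input.charAt(i))) i++;
                lexemes.add(input.substring(start, i));
            } else {
                // Panic mode: discard characters until a synchronization point,
                // here ";" or the first character of a valid lexeme.
                while (i < input.length() && !isSyncPoint(input.charAt(i))) i++;
            }
        }
        return lexemes;
    }

    private static boolean isSyncPoint(char c) {
        return c == ';' || Character.isLetterOrDigit(c);
    }
}
```

With this sketch, `PanicModeLexer.tokenize("x 1 @#$ y;")` drops the run `@#$` and continues with `y`. The invalid input disappears without a trace, which is exactly the behaviour the question is about.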

I think the lexer should not throw any errors. If it cannot match the input to a valid lexeme, it can try panic-mode recovery, but it should never report errors itself. If the source code has errors and tokens cannot be extracted, this is a syntax error (e.g. if the user entered ">" instead of ")", which is syntactically wrong).

What is your opinion? Should the lexer throw errors if it cannot detect a lexeme, for example with the loc and coc where it found the invalid input? Or is an invalid lexeme a syntax error?

Tkrauss commented 11 years ago

That's a good question. We already discussed it at the project meeting today (or yesterday, oops). I think it's hard to decide what to do, but after a long discussion we agreed that this is a matter for the parser. If a lexeme can't be recognized, a NOT_A_TOKEN (like NaN for numbers) is inserted into the token stream like any regular token. The parser can then decide what to do. Afaik, this was our decision...
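
Roughly, the NOT_A_TOKEN idea could look like the following Java sketch. All names here (`TokenType`, `Token`, `SketchLexer`) and the tiny lexeme set are assumptions for illustration only, not the agreed interfaces.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration; they are not the project's real interfaces.
enum TokenType { ID, NUM, SEMICOLON, NOT_A_TOKEN, EOF }

final class Token {
    final TokenType type;
    final String lexeme;
    final int line;    // 1-based line where the lexeme starts
    final int column;  // 1-based column where the lexeme starts

    Token(TokenType type, String lexeme, int line, int column) {
        this.type = type;
        this.lexeme = lexeme;
        this.line = line;
        this.column = column;
    }
}

final class SketchLexer {

    /** Never throws: unrecognized input becomes a NOT_A_TOKEN entry in the stream. */
    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        int line = 1, column = 1, i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '\n') { line++; column = 1; i++; continue; }
            if (Character.isWhitespace(c)) { column++; i++; continue; }
            int start = i;
            TokenType type;
            if (c == ';') {
                type = TokenType.SEMICOLON;
                i++;
            } else if (Character.isLetter(c)) {
                type = TokenType.ID;
                while (i < input.length() && Character.isLetterOrDigit(input.charAt(i))) i++;
            } else if (Character.isDigit(c)) {
                type = TokenType.NUM;
                while (i < input.length() && Character.isDigit(input.charAt(i))) i++;
            } else {
                // Collect the whole unrecognized run into a single NOT_A_TOKEN.
                type = TokenType.NOT_A_TOKEN;
                while (i < input.length() && !isKnownStart(input.charAt(i))) i++;
            }
            tokens.add(new Token(type, input.substring(start, i), line, column));
            column += i - start;
        }
        tokens.add(new Token(TokenType.EOF, "", line, column));
        return tokens;
    }

    private static boolean isKnownStart(char c) {
        return c == ';' || Character.isLetterOrDigit(c) || Character.isWhitespace(c);
    }
}
```

For the input `x > 1;` this sketch yields ID, NOT_A_TOKEN, NUM, SEMICOLON and EOF. The NOT_A_TOKEN entry keeps its lexeme and position, so the parser still has everything it needs to report the error.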

flofreud commented 11 years ago

The modifications needed for lexer/parser error handling are in:

9d81484a9331fea85677a779281ecfb50677b188
ba21de51f20df80ebbfa032c0569d4128fd90873
cb75865f914400b783eba6c4b0a32732f33f51de

I hope the specification provided in the Javadoc is sufficient for the implementation.

swp-uebersetzerbau-ss13 / common