update Lexer to use Errors and add some support for C23 stuffs

NiumXp commented 8 months ago

In progress

[x] Add support for other char/string prefixes such as u and u8
[ ] ~~(Maybe) Add support for ' (single quote) as an integer digit separator~~
[x] Add extensive tests for the new lexer methods
- [x] Add preprocessor token tests
- [x] Add float (hexadecimal, etc.) tests
[x] Remove cast calls or replace them with # type: ignore
[x] Remove assert statements

Errors instead of Exceptions

I removed all exceptions from the Lexer (I'm still adding tests to make it less error-prone) and replaced them with Errors, which show more user-friendly messages to users (highlights are coming soon). We don't need to stop norminette when a float is syntactically valid but semantically wrong (I added 8 errors for floats/integers).
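
As a rough illustration of the idea (not the actual implementation in this PR), the sketch below records a problem and keeps lexing instead of raising. `LexError`, `MiniLexer`, and the `BAD_INTEGER` name are hypothetical and only exist for this example; the real Errors class in norminette may look different.

```python
from dataclasses import dataclass, field


@dataclass
class LexError:
    # Hypothetical error record, invented for this sketch.
    name: str
    text: str
    offset: int


@dataclass
class MiniLexer:
    # Hypothetical toy lexer: it only understands space-separated integers.
    source: str
    errors: list = field(default_factory=list)

    def lex_integers(self):
        """Yield integer literals; record an error instead of raising on a bad one."""
        offset = 0
        for word in self.source.split(" "):
            if word.isascii() and word.isdigit():
                yield int(word)
            elif word:
                # Keep going: the caller can report every problem at the end.
                self.errors.append(
                    LexError("BAD_INTEGER", f"invalid integer literal {word!r}", offset)
                )
            offset += len(word) + 1
```

With this shape, one bad literal no longer hides the problems that come after it, since the caller reads `errors` once lexing is done.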

Alternative lexemes

Norminette now translates alternative spellings such as <% and ??< to LBRACE ({). Note that digraphs and trigraphs inside strings are translated as well, as #486 did in comments.
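
For reference, a minimal sketch of such a translation table. The digraph and trigraph spellings are the ones defined by the C standard; the `canonical_spelling` helper is hypothetical and is not necessarily how the Lexer in this PR does the lookup.

```python
# Digraphs and trigraphs from the C standard, mapped to their canonical spelling.
DIGRAPHS = {"<:": "[", ":>": "]", "<%": "{", "%>": "}", "%:": "#"}
TRIGRAPHS = {
    "??=": "#", "??(": "[", "??/": "\\", "??)": "]", "??'": "^",
    "??<": "{", "??!": "|", "??>": "}", "??-": "~",
}


def canonical_spelling(lexeme: str) -> str:
    """Return the canonical spelling of a lexeme (hypothetical helper)."""
    # Lexemes that are neither a trigraph nor a digraph are returned unchanged.
    return TRIGRAPHS.get(lexeme, DIGRAPHS.get(lexeme, lexeme))
```

Both `canonical_spelling("<%")` and `canonical_spelling("??<")` give `{`, so a single LBRACE token can cover all three spellings.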

Escaped newline

This PR removes the <ESCAPED_NEWLINE> token, which means that code relying on it, such as CheckLineCount, will be buggy (I intend to fix it, along with other rules, in a separate PR).

Chars as Strings, Escape sequences

If the user writes a string with ' (single quote), like 'hello', they will see an error pointing it out. Another cool thing is that bad escape sequences like \m, \xGG, etc. now produce an error too.
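
A minimal sketch of how bad escape sequences could be detected, assuming the usual C escapes (simple, octal, hexadecimal, and universal character names). The `bad_escapes` helper and the regular expression are illustrative only and are not taken from this PR.

```python
import re

# Valid C escapes: simple, octal (1-3 digits), hex (1+ digits), \uXXXX and \UXXXXXXXX.
VALID_ESCAPE = re.compile(
    r"""\\(?:['"?\\abfnrtv]|[0-7]{1,3}|x[0-9A-Fa-f]+|u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})"""
)


def bad_escapes(body: str) -> list[str]:
    """Return the invalid escape sequences found in a literal body (hypothetical helper)."""
    bad = []
    i = 0
    while i < len(body):
        if body[i] != "\\":
            i += 1
            continue
        match = VALID_ESCAPE.match(body, i)
        if match:
            i = match.end()
        else:
            # Report the backslash and the character after it, then move on.
            bad.append(body[i:i + 2])
            i += 2
    return bad
```

The function returns every offending sequence rather than stopping at the first one, which fits the Errors approach described earlier in this PR.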

Suffix in numeric literals

Added the i64 (Microsoft-specific), wb (_BitInt(N)) and z (size_t) suffixes for integer literals. For float literals, added d* (_DecimalN), plus i and j for complex literals (_Complex).
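
A simplified sketch of a suffix check covering the suffixes listed above plus the classic u, l, ll and f. The sets and the `is_valid_suffix` helper are hypothetical. The check also takes two shortcuts: it lowercases the suffix (so a mixed-case lL would wrongly pass) and it ignores combinations of i/j with f or l.

```python
# Hypothetical suffix tables, lowercase only; d* is expanded to df, dd and dl.
INTEGER_SUFFIXES = {
    "u", "l", "ul", "lu", "ll", "ull", "llu",  # classic suffixes
    "wb", "uwb", "wbu",                        # _BitInt(N)
    "z", "uz", "zu",                           # size_t
    "i64",                                     # Microsoft-specific
}
FLOAT_SUFFIXES = {
    "f", "l",            # float, long double
    "df", "dd", "dl",    # _Decimal32, _Decimal64, _Decimal128
    "i", "j",            # complex literals
}


def is_valid_suffix(suffix: str, *, is_float: bool) -> bool:
    """Check a numeric literal suffix against the tables above (simplified)."""
    allowed = FLOAT_SUFFIXES if is_float else INTEGER_SUFFIXES
    return suffix.lower() in allowed
```

For example, `is_valid_suffix("wb", is_float=False)` is true, while the same suffix on a float literal is rejected.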

Others

Moved tests related to lexer to test_lexer.py
Removed the TokenError exception completely

matthieu42Network commented 7 months ago

Hello, I didn't review because of the 6/7 validated but is it over?

NiumXp commented 7 months ago

> Hello, I didn't review because of the 6/7 validated but is it over?

Yes, it is

42School / norminette