goccmack / gocc

Parser / Scanner Generator

Replace hardcoded "$" EOF characters by something less general #128

Closed by skius 1 year ago

skius commented 3 years ago

WIP

This fixes #127, but in an ugly manner: I changed every "$" to the SUBSTITUTE character "␚", in the hope that no one uses it. A better fix would be to get rid of a concrete representation of EOF entirely and replace it with an abstract struct, but that would involve refactoring the lexer to use a new interface instead of strings, and I don't know enough about this project to understand everything that would need to change.
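To make the "abstract struct" idea a bit more concrete, here is a minimal sketch of what it could look like. This is not gocc's code: `Kind`, `Symbol`, and `EOFSymbol` are hypothetical names made up for illustration, and the real refactoring would have to touch the lexer and the generated tables. The point is only that if end-of-input is identified by its kind and not by a string, no user-defined literal can collide with it.

```go
package main

import "fmt"

// Kind distinguishes the end-of-input marker from ordinary terminals.
// All names in this sketch are hypothetical and not part of gocc.
type Kind int

const (
	Terminal Kind = iota
	EOF
)

// Symbol identifies a grammar symbol by kind and literal, so EOF never
// shares an identity with a user-defined literal such as "$".
type Symbol struct {
	Kind    Kind
	Literal string
}

// EOFSymbol is the single end-of-input symbol; it has no literal spelling.
var EOFSymbol = Symbol{Kind: EOF}

// String renders the symbol for diagnostics only; the rendered text is
// never used to look the symbol up.
func (s Symbol) String() string {
	if s.Kind == EOF {
		return "<EOF>"
	}
	return fmt.Sprintf("%q", s.Literal)
}

func main() {
	dollar := Symbol{Kind: Terminal, Literal: "$"}

	// Symbol is a comparable struct, so it can key a (toy) action table.
	actions := map[Symbol]string{
		dollar:    "shift",
		EOFSymbol: "accept",
	}

	fmt.Println(dollar == EOFSymbol)                 // false
	fmt.Println(actions[dollar], actions[EOFSymbol]) // shift accept
}
```

With this shape, a grammar that uses "$" as an ordinary token and the parser's end-of-input marker stay distinct entries, which is the property the character swap in this PR only approximates.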

Something else that came up: is it possible that the gen.sh script is outdated? I tried generating a new frontend for gocc, but got a bunch of warnings, and the resulting frontend doesn't work:

warning: undefined symbol "g_sdt_lit" used in productions ["FileHeader" "SyntaxBody" "SyntaxBody"]
warning: undefined symbol "regDefId" used in productions ["LexProduction" "LexTerm"]
warning: undefined symbol "char_lit" used in productions ["LexTerm" "LexTerm" "LexTerm"]
warning: undefined symbol "prodId" used in productions ["SyntaxProduction" "Symbol"]
warning: undefined symbol "string_lit" used in productions ["Symbol"]
warning: undefined symbol "tokId" used in productions ["LexProduction" "Symbol"]
warning: undefined symbol "ignoredTokId" used in productions ["LexProduction"]

This happens on the current master branch too, so I don't think it's due to my additions.

As such, I had to manually change gocc's current frontend to use SUB instead of "$".

awalterschulze commented 1 year ago

Looks good to me