Closed by fluffle 9 years ago
[20:36]
So, yeah. "2nd October 10am" breaks down into the following tokens:
2 T_INTEGER "nd" T_DAYQUAL "October" T_MONTHNAME 10 T_INTEGER "A" "A" "M" "M" // M is so ambiguous I have to special case it basically everywhere.
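For anyone who wants to poke at this, here's a rough sketch of a lexer that produces that token stream. It's not the real lexer: the `token` type, the `lex` function and the two tiny lookup tables are all made up for illustration, and only the token names (T_INTEGER, T_DAYQUAL, T_MONTHNAME, plus single letters as their own tokens) come from the breakdown above.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// token pairs a token kind with the text it was lexed from.
// (Hypothetical type, for illustration only.)
type token struct {
	Kind, Text string
}

// Deliberately tiny lookup tables; a real lexer would know every month.
var months = map[string]bool{"OCTOBER": true}
var dayQuals = map[string]bool{"ST": true, "ND": true, "RD": true, "TH": true}

// lex splits input into tokens. Digit runs become T_INTEGER, known words
// become T_MONTHNAME or T_DAYQUAL, and anything else is emitted one
// upper-cased character at a time, so "am" turns into the tokens A and M.
func lex(input string) []token {
	var toks []token
	rs := []rune(input)
	for i := 0; i < len(rs); {
		switch {
		case unicode.IsSpace(rs[i]):
			i++
		case unicode.IsDigit(rs[i]):
			j := i
			for j < len(rs) && unicode.IsDigit(rs[j]) {
				j++
			}
			toks = append(toks, token{"T_INTEGER", string(rs[i:j])})
			i = j
		case unicode.IsLetter(rs[i]):
			j := i
			for j < len(rs) && unicode.IsLetter(rs[j]) {
				j++
			}
			word := string(rs[i:j])
			key := strings.ToUpper(word)
			switch {
			case months[key]:
				toks = append(toks, token{"T_MONTHNAME", word})
			case dayQuals[key]:
				toks = append(toks, token{"T_DAYQUAL", word})
			default:
				// Unknown word: each letter is its own token.
				for _, r := range key {
					toks = append(toks, token{string(r), string(r)})
				}
			}
			i = j
		default:
			// Punctuation and the like: single-character token.
			toks = append(toks, token{string(rs[i]), string(rs[i])})
			i++
		}
	}
	return toks
}

func main() {
	for _, t := range lex("2nd October 10am") {
		fmt.Printf("%-12s %q\n", t.Kind, t.Text)
	}
}
```

The point is just that by the time the parser sees "10", it's a bare T_INTEGER with nothing to say whether it's a year or an hour.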
This means the leftmost-longest match is (2), not (1), which then fails because it leaves just "AM" on the token stack, and that can't be parsed correctly. Putting the "at" in there makes (1) the longest match, since it inserts a T_IGNORE after the T_MONTHNAME, and so everything parses fine.
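A toy illustration of that failure, with loud caveats: this is a greedy prefix matcher, not an LALR parser, and the two rules are my guess at the shape of (1) and (2) (a day-month date and a day-month-year date), which aren't shown here. The `rule` type and `longestMatch` are invented names.

```go
package main

import "fmt"

// rule is a named sequence of token kinds. Both rules below are
// assumptions about what (1) and (2) look like, not the real grammar.
type rule struct {
	Name  string
	Kinds []string
}

var dateRules = []rule{
	{"day month", []string{"T_INTEGER", "T_DAYQUAL", "T_MONTHNAME"}},
	{"day month year", []string{"T_INTEGER", "T_DAYQUAL", "T_MONTHNAME", "T_INTEGER"}},
}

// longestMatch returns the rule that matches the longest prefix of kinds,
// the token kinds left over after it, and whether any rule matched.
func longestMatch(kinds []string) (rule, []string, bool) {
	var best rule
	found := false
	for _, r := range dateRules {
		// Skip rules that can't fit or can't beat the current best.
		if len(r.Kinds) > len(kinds) || len(r.Kinds) <= len(best.Kinds) {
			continue
		}
		ok := true
		for i, k := range r.Kinds {
			if kinds[i] != k {
				ok = false
				break
			}
		}
		if ok {
			best, found = r, true
		}
	}
	return best, kinds[len(best.Kinds):], found
}

func main() {
	// "2nd October 10am": the 10 gets swallowed as a year.
	without := []string{"T_INTEGER", "T_DAYQUAL", "T_MONTHNAME", "T_INTEGER", "A", "M"}
	// "2nd October at 10am": T_IGNORE stops the longer rule from matching.
	with := []string{"T_INTEGER", "T_DAYQUAL", "T_MONTHNAME", "T_IGNORE", "T_INTEGER", "A", "M"}

	for _, kinds := range [][]string{without, with} {
		r, rest, _ := longestMatch(kinds)
		fmt.Printf("matched %q, left over %v\n", r.Name, rest)
	}
}
```

Without the T_IGNORE, the longer rule wins and only A and M are left over, which is the dead end described above. With it, the shorter rule wins and the integer stays available to be parsed as a time.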
I could make the comma non-optional to disambiguate better, but I think "2nd October 2015" is likely to be more common than "2nd October 10am". Realistically, LALR parsing is a poor fit for this kind of grammar. It's possible that someone has built a GLR parser in Go in the last 4 years or so; maybe I'll investigate at some point. Otherwise, meh. Suggestions for alternative ways to disambiguate on a postcard, please :-)
Thanks