softdevteam / grmtools

Rust grammar tool libraries and binaries
Other
494 stars 32 forks source link

Allow `%prec` to define a token in the grammar. #453

Closed ltratt closed 2 months ago

ltratt commented 2 months ago

Previously if %prec 'x' was the first mention of the token x then cfgrammar would say "'x' isn't a token" which is clearly incorrect. This commit is maybe best thought of as a quick fix of sorts: it makes %prec 'x' define a token x if it doesn't already exist.

However -- and my memory is completely fuzzy as to whether this is a Yacc compatibility thing or not -- this means that %prec 'x' and %prec x' (i.e. with and without quote marks) have the semantic meaning becauseparse_token` treats both interchangeably. This seems consistent with other parts of the grammar even if a little odd.

I believe this fixes https://github.com/softdevteam/grmtools/issues/450.

ratmice commented 2 months ago

I should be honest and say I'm not really familiar with %prec or any of the associativity declarations, as I tend towards manually resolving ambiguity even if doing so it isn't terribly convenient.

This seems okay for a quick fix though, in that it seems low impact and isolated to %prec, which specifically does not affect parse_token which I would be more concerned about.

One other part of grmtools (albeit there isn't an equivalent in yacc) where quotes have semantic meaning and aren't interchangeable is %expect-unused, which first tries to parse it as a name then failing that tries to parse it as a token, which in that path means generally unquoted tokens will be treated as a name.

Anyhow i'm happy to approve if you think it is ready.

ltratt commented 2 months ago

One other part of grmtools (albeit there isn't an equivalent in yacc) where quotes have semantic meaning and aren't interchangeable is %expect-unused, which first tries to parse it as a name then failing that tries to parse it as a token, which in that path means generally unquoted tokens will be treated as a name.

Good point. I wonder if there should be another PR which a) checks exactly what yacc does in these cases (I vaguely remember it allows either case, but I might well be wrong) b) checks the code for precise compatibility with yacc? I'm not sure I'll get to that very soon, so at least this PR nudges us in a better-than-the-status-quo direction...

ratmice commented 2 months ago

There is a bit more about prec in https://pubs.opengroup.org/onlinepubs/7908799/xcu/yacc.html But here is the relevant bits that would affect quoting

The reserved symbol %prec can appear immediately after the body of the grammar rule and can be followed by a token name or a literal. It will cause the precedence of the grammar rule to become that of the following token name or literal.

Edit: So it definitely seems like it allows either case, but it seems pretty silent on the handling of this specific token/rule not found case.