goffrie / lalr

CFGs to LALR(1) parse tables
Other
10 stars 1 forks source link

Resolve ambiguous grammar by favoring shifting #3

Closed maxrdz closed 10 months ago

maxrdz commented 11 months ago

I've been running into lots of issues writing my CFG for a language specification that i'm trying to parse using plex, which uses this crate under the hood. After reading more about LALR 'bottom-up' parsers and the shift-reduce parsing method, I realized that some parsers resolve ambiguous grammar using a simple rule: If a shift-reduce conflict is encountered, the parser favors shifting over reducing.

image

I've also found that GNU Bison uses this method in its generated parsers, which allows the developer to write some ambiguous grammar. I'm requesting this as a feature for plex; I'm aware it's not actively worked on at the moment, but I'll see if I can contribute this if the maintainers don't.

goffrie commented 11 months ago

There is a way to favour shift over reduce, but in a targeted way: https://github.com/goffrie/lalr/blob/ecf92c511de89f11b210b9b8633dbc5cc0e59799/src/lib.rs#L377-L381

In the plex macro you can put the #[no_reduce(TOKEN)] attribute on a production to force it to never reduce (i.e. never apply) when the lookahead token is TOKEN.

This can be used to solve the classical if-else nesting problem as in this (fairly recently added) example: https://github.com/goffrie/plex/blob/9a862b93252653c177b77084de1bb5de094b31dc/tests/ui/if_else_shift_reduce.rs#L34-L41

Admittedly this could be better documented, but does that give you the tools you need?

maxrdz commented 11 months ago

This looks like it would solve my issues; I'll let you know if it works out, thanks. 😊

maxrdz commented 11 months ago

This works, thank you!

maxrdz commented 10 months ago

There is a way to favour shift over reduce, but in a targeted way:

https://github.com/goffrie/lalr/blob/ecf92c511de89f11b210b9b8633dbc5cc0e59799/src/lib.rs#L377-L381

In the plex macro you can put the #[no_reduce(TOKEN)] attribute on a production to force it to never reduce (i.e. never apply) when the lookahead token is TOKEN.

This can be used to solve the classical if-else nesting problem as in this (fairly recently added) example: https://github.com/goffrie/plex/blob/9a862b93252653c177b77084de1bb5de094b31dc/tests/ui/if_else_shift_reduce.rs#L34-L41

Admittedly this could be better documented, but does that give you the tools you need?

I see there is also 'priority_of' to resolve reduce-reduce conflicts. Is this a valid attribute? I am not able to use it the same way as I use 'no_reduce'. (Also wondering if these names were changed at one point, since in lib.rs it is called 'reduce_on' instead, unless its referencing another attribute)

maxrdz commented 10 months ago

I would totally look more in depth into the source of lalr, and even consider contributing towards it, but I will be honest this is super super advanced for me. haha

maxrdz commented 10 months ago

I looked at the source for a second for plex, since that is what I'm directly using, and I can see how it forms an interface for passing down these attributes to this library. If I'm correct what I'm looking for is the #[overriding] attribute.

Will make sure to consider contributing instead if I need a more specific solution. Closing again.

goffrie commented 10 months ago

Yup, that's right. Sorry the code isn't that great, it's really old at this point..

On Sat, Dec 23, 2023, 00:19 Max Rodriguez @.***> wrote:

I looked at the source for a second for plex, since that is what I'm directly using, and I can see how it forms an interface for passing down these attributes to this library. If I'm correct what I'm looking for is the #[overriding] attribute.

Will make sure to consider contributing instead if I need a more specific solution. Closing again.

— Reply to this email directly, view it on GitHub https://github.com/goffrie/lalr/issues/3#issuecomment-1868219634, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJO2TRXC2WAR4UHLNH4EJ3YKZZWFAVCNFSM6AAAAAA7L2RJX2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQNRYGIYTSNRTGQ . You are receiving this because you commented.Message ID: @.***>