shnewto / bnf

Parse BNF grammar definitions
MIT License

WIP: adding EBNF #54

Closed SKalt closed 4 years ago

SKalt commented 4 years ago

Work on #17. This is a place to show how my work's progressing.

prbs23 commented 4 years ago

This is great! I'm glad to see someone else looking at this area. I had started working on this problem as well, but only got as far as implementing the parsers. Integrating some of the EBNF-specific features (repetitions, character sets, etc.) into the existing data structures got a little tricky, and I ran out of free time to work on it. If you are interested, you can check out the work I had done so far in this fork: https://github.com/prbs23/bnf/tree/ebnf_parser

SKalt commented 4 years ago

Hey @prbs23, I'm interested in picking up your work again. Do you remember how you were thinking of slotting your newer EBNF architecture into the existing data structures?

prbs23 commented 4 years ago

Hey @SKalt, it's been a while since I thought about this, so let's see how much I remember. In the ebnf_parser branch of my fork (linked above), inside ebnf_w3c_parsers.rs, I defined local copies of all the data structures and internal fields that I thought were needed to represent an EBNF grammar: essentially the same Grammar, Production, Expression, and Term structures that are currently used. I think the structures just need to be expanded and modified somewhat to handle additional types of expressions.

The biggest change is that EBNF has four different types of expressions (term, difference, sequence, and choice) instead of an expression just being a list of terms. I was thinking I would convert the existing Expression struct into an enum to capture these four types. The other difference is that EBNF has a new type of "Term", which is a character class. This isn't as big a change to the current Term structure, since it's already an enum that can represent a Terminal or NonTerminal term value. I think the Grammar and Production structures can remain essentially the same. The only change I think was needed is that the rhs value of the Production can now be a single Expression enum instead of a list of Expressions.
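For illustration only, here is a rough sketch of what that proposal might look like in Rust. It is not taken from the fork or from the current crate: the variant names, the shape of `CharacterClass`, and the helpers `example_production` and `count_terms` are all assumptions made up for this example.

```rust
// Hypothetical sketch of the proposed EBNF data structures.
// None of these definitions are the crate's actual API.

#[derive(Debug, Clone, PartialEq)]
enum Term {
    Terminal(String),
    Nonterminal(String),
    // New for EBNF: a character class such as [0-9] or [^a-z].
    // Storing inclusive ranges plus a negation flag is an assumption.
    CharacterClass { ranges: Vec<(char, char)>, negated: bool },
}

// Expression becomes an enum with the four kinds named in the proposal,
// rather than a struct holding a flat list of terms.
#[derive(Debug, Clone, PartialEq)]
enum Expression {
    Term(Term),
    Difference(Box<Expression>, Box<Expression>),
    Sequence(Vec<Expression>),
    Choice(Vec<Expression>),
}

// The right-hand side is now a single Expression, not a list of them.
#[derive(Debug, Clone, PartialEq)]
struct Production {
    lhs: Term,
    rhs: Expression,
}

#[derive(Debug, Clone, PartialEq)]
struct Grammar {
    productions: Vec<Production>,
}

// Builds the rule `ident ::= letter (letter | [0-9])` by hand.
fn example_production() -> Production {
    let letter = Expression::Term(Term::Nonterminal("letter".to_string()));
    let digit = Expression::Term(Term::CharacterClass {
        ranges: vec![('0', '9')],
        negated: false,
    });
    Production {
        lhs: Term::Nonterminal("ident".to_string()),
        rhs: Expression::Sequence(vec![
            letter.clone(),
            Expression::Choice(vec![letter, digit]),
        ]),
    }
}

// Counts the leaf terms in an expression tree by recursing over all four kinds.
fn count_terms(expr: &Expression) -> usize {
    match expr {
        Expression::Term(_) => 1,
        Expression::Difference(left, right) => count_terms(left) + count_terms(right),
        Expression::Sequence(items) | Expression::Choice(items) => {
            items.iter().map(count_terms).sum()
        }
    }
}
```

One consequence of this shape is that anything walking a grammar has to recurse over nested expressions, as `count_terms` does, instead of iterating over a flat list of terms.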

I never got far enough into actually making these changes to know what kinds of unforeseen issues there would be with this proposal. I'm sure there's something I'm overlooking, since I'm not that familiar with this code base, but I think it's a reasonable starting point.

shnewto commented 4 years ago

👋 Hey @SKalt (and @prbs23 since you're here too), I need to revise some of this repo's early history and wanted to give you both a heads up so you aren't surprised by the wild complaints it'll cause the next time you try to sync up. I'd say, at a minimum, make a backup before trying to pull anything in 🙃 And let me know if you have any issues, I'm happy to try and help out.

shnewto commented 4 years ago

@SKalt looks like me kicking off the history revisions caused this to close automatically 🤔 Feel free to open it up again or open another any time.