shnewto / bnf

Parse BNF grammar definitions
MIT License
256 stars 22 forks source link

Reworked generate #127

Closed shnewto closed 1 year ago

shnewto commented 1 year ago

Implementation

A swing at addressing #70 though rather than getting any smarter about how we actually generate sentences (that can surely be improved someday), the changes in this PR implement checks that generation can succeed. Aside from the tests, and a couple little things cargo fmt threw in there, the actual implementation is relatively short, thanks to the proposal from @notDhruv, that served me well ❤️

That proposal was for implementation that effectively lets us decide if generation can succeed lives here https://github.com/shnewto/bnf/issues/70#issuecomment-1356299131

I was happy to follow it to the letter, though one note if you're looking at the code, in my implementation "marking" non-terminals means adding them to a Vec called terminating_rules

Heads Up

This represents a breaking change! (though a great one imo) It removes the RecursionLimit error, and previously, you could generate from grammars that didn't hit a terminal state, they'd just print what they had, including the string representation of non-terminals if that's all there was. Now, that's not the case :)

coveralls commented 1 year ago

Coverage Status

Coverage: 91.694% (+0.6%) from 91.063% when pulling 79e8abf50535c32ed56d9851f00b278bec306f91 on reworked-generate into 9db5aaafb99c7101340ebe180a7c7ef7c966068f on main.

shnewto commented 1 year ago

Another couple notes. First, this opens the door to being able to better set some "min" and "max" for generated output but that seems to me better served by another PR. as it stands, if you generate with this grammar from the docs:

<dna> ::= <base> | <base> <dna>
<base> ::= 'A' | 'C' | 'G' | 'T'

you're going to get the same, pretty random, non-deterministic output, i.e.

Ok("AAGGC")
Ok("GCGT")
Ok("TTTC")
Ok("C")

Second! I'm still kind of toying with the idea of changing the signature on generate to take an optional starting rule, i.e. the LHS of the prod you want to "start" generating from. On one hand, it means maybe if you want to generate from different parts of a large grammar, you can more readily. On the other, it clutters the invocation with a parameter that if you don't want to use it, have to deal with passing None to opt out. It also adds another fail case / place the API can break on ya, i.e. if you pass in a starting rule that doesn't look like the starting rules we know about...

Maybe... that could be another PR and instead of adding an Option to the generate signature, we could add a non-breaking, generate_from or something if we learn there's an interest.

shnewto commented 1 year ago

woah that clippy action is great, will address those fixes a bit later but also, TIL that default cargo clippy I guess ignores tests and you need one of --all-features --all-targets to get these lints locally?

CrockAgile commented 1 year ago

Yeah since cfg(test) is technically a "feature", compilation and clippy don't include any test (or other features/targets) without those flags.

I personally think test should at least be a default clippy feature enabled, but w/e