Changes
Fix precedence of modified string literals
Fixes #201, in which the precedence of modified strings like r"..." was wrong, causing those strings to be parsed as a reference r followed by a string literal. This is technically a breaking change, since grammars like the following are no longer valid:
foo = baz"bar"
baz = "baz"
However, the fix is fairly straightforward (add some spaces), and this was probably a fairly rare occurrence anyway.
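To illustrate the precedence issue outside of parsimonious, here is a minimal sketch using ordered alternation in `re`, which behaves like ordered choice in a PEG: the first branch that matches wins. The two token patterns below are hypothetical simplifications, not the actual rules from parsimonious's grammar.

```python
import re

# Hypothetical, simplified token patterns (assumptions for illustration only;
# these are not the rules parsimonious actually uses).
REFERENCE = r'(?P<reference>[a-zA-Z_][a-zA-Z_0-9]*)'
MODIFIED_STRING = r'(?P<string>[a-zA-Z]*"[^"]*")'

# Alternation is ordered: the leftmost branch that matches is taken.
reference_first = re.compile(REFERENCE + '|' + MODIFIED_STRING)
string_first = re.compile(MODIFIED_STRING + '|' + REFERENCE)


def first_token(pattern, text):
    """Return (kind, matched_text) for the token at the start of text."""
    match = pattern.match(text)
    return match.lastgroup, match.group()


# Trying references first splits r"abc" into a reference r plus a string.
old_behaviour = first_token(reference_first, 'r"abc"')
# Trying modified strings first keeps the modifier attached to the literal.
new_behaviour = first_token(string_first, 'r"abc"')
```

Under the second ordering, `baz"bar"` is also consumed as a single modified string, which is why grammars that relied on juxtaposition without a space need the space added.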
I'm guessing this was not previously caught because modifiers were only useful for ~r"regex nodes", where the precedence was correct. However, modified string literals are required for the next feature.
Support for parsing binary files
This turned out to be much easier than I'd anticipated because of Erik’s clever use of ast.literal_eval to evaluate string literals.
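For reference, this is roughly why `ast.literal_eval` helps here: it already understands Python's string prefixes, so the literal's source text determines whether the result is `str` or `bytes`. The snippet only demonstrates the standard-library call; how parsimonious passes literals to it is not shown.

```python
import ast

# The argument is the literal's source text, as it would appear in a grammar.
str_literal = ast.literal_eval('"abc"')
raw_literal = ast.literal_eval(r'r"\d+"')        # raw prefix: backslash kept
bytes_literal = ast.literal_eval(r'b"\x00\xff"')  # bytes prefix: yields bytes

print(type(str_literal).__name__, type(raw_literal).__name__,
      type(bytes_literal).__name__)
```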
Once you can define a bytes literal in a grammar, everything "just works" because at base, parsimonious is just calling .startswith(...), .endswith(...) and re.match, all of which work fine as long as the arguments' types match (str xor bytes).
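A small standard-library sketch of the "types must match" point: the same calls work on both `str` and `bytes`, and mixing the two raises `TypeError`. The sample inputs are made up for illustration.

```python
import re

text_input = "hello world"
binary_input = b"\x89PNG\r\n"

# str against str and bytes against bytes both work.
str_ok = text_input.startswith("hello")
bytes_ok = binary_input.startswith(b"\x89PNG")
bytes_regex_ok = re.match(rb"\x89[A-Z]+", binary_input) is not None


def raises_type_error(func):
    """Return True if calling func() raises TypeError."""
    try:
        func()
    except TypeError:
        return True
    return False


# Mixing str and bytes fails for both the string methods and re.match.
mixed_startswith_fails = raises_type_error(
    lambda: binary_input.startswith("PNG"))
mixed_regex_fails = raises_type_error(
    lambda: re.match(r"[A-Z]+", binary_input))
```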
To make this feature easier to use, I added validation requiring that all string literals (and by extension regexes) in a grammar be of the same type.
I added some documentation for the feature, but I'm happy to add it to another section if you think it's warranted.
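The idea behind the check can be sketched as follows. The function name `check_literal_types` and the use of `ValueError` are assumptions for illustration; the actual validation in this PR lives inside the grammar-building code and may raise a different exception.

```python
def check_literal_types(literals):
    """Return the single type shared by all literals.

    Hypothetical helper: raises ValueError if str and bytes literals are
    mixed. An empty collection is treated as str by assumption.
    """
    types = {type(literal) for literal in literals}
    if len(types) > 1:
        names = ", ".join(sorted(t.__name__ for t in types))
        raise ValueError(
            "all literals in a grammar must share one type, found: " + names)
    return types.pop() if types else str


text_type = check_literal_types(["foo", "bar"])
binary_type = check_literal_types([b"\x00", b"\xff"])
```

Failing at grammar-construction time gives a clearer message than the `TypeError` that would otherwise surface later, in the middle of a parse.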
Testing
All of the changes are tested by unit tests asserting:
Precedence of modified string literals is correct.
Grammar strings with bytes literals inside can be parsed into grammars.
Grammar strings with both bytes and str literals raise an error.