Open tadman opened 2 years ago
Reason is here:
https://github.com/pest-parser/pest/blob/v2.5.3/meta/src/optimizer/unroller.rs#L52
If we define a repeat_min_max rule, e.g. text = { a{1,255} }, that logic creates 255 AST nodes in the optimizer.
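To make the blowup concrete, here is a rough sketch of the unrolling idea (this is not pest's actual code; the `unroll` function and its output format are illustrative assumptions): `e{min, max}` is expanded into `min` mandatory copies of `e` followed by `max - min` nested optionals, so the size of the generated expression tree grows with `max`.

```rust
// Illustrative sketch, NOT pest's real unroller: expand e{min, max}
// into `min` mandatory copies plus (max - min) nested optional copies.
fn unroll(expr: &str, min: usize, max: usize) -> String {
    // Build the optional tail: e.g. for (max - min) = 2 this yields
    // "(e ~ e?)?" — each extra repetition nests one level deeper.
    let mut tail = String::new();
    for _ in 0..(max - min) {
        if tail.is_empty() {
            tail = format!("{}?", expr);
        } else {
            tail = format!("({} ~ {})?", expr, tail);
        }
    }
    // The mandatory prefix: `min` copies of the expression.
    let mut parts: Vec<String> = vec![expr.to_string(); min];
    if !tail.is_empty() {
        parts.push(tail);
    }
    parts.join(" ~ ")
}
```

With large bounds like `{0,998}` this produces nearly a thousand expression nodes from a single rule, which matches the compile-time growth reported below.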
Is there a recommended way of dealing with length-limited fields when parsing? I might be doing it wrong, but I'd like to be able to do this at the parser level so it can just cut out and quit if fed egregiously out of spec data.
When implementing a spec based on an RFC, I wrote a rule like
text{0,998}
, as the spec (RFC 5322) dictates that lines cannot exceed 998 characters. This dramatically increased compile times from negligible to many seconds. Changing this to
text*
and doing that validation step in other code eliminates the problem. By varying the repeat count in the rule and running tests on a Mac Mini (M1 2020):
text{0,2048} = 36.04s
text{0,998} = 8.77s
text{0,512} = 2.94s
text{0,256} = 1.12s
text{0,128} = 0.62s

Beyond a certain point it's just a "stack overflow":

text{0,3000} = fatal runtime error: stack overflow
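The workaround described above, parsing with an unbounded `text*` rule and enforcing the 998-character limit in ordinary code, might look something like this (the `check_line_length` helper is hypothetical, not part of pest):

```rust
// Hypothetical helper: enforce RFC 5322's 998-character line limit
// outside the grammar, so the rule itself can stay as `text*`.
fn check_line_length(line: &str, max: usize) -> Result<(), String> {
    let n = line.chars().count();
    if n > max {
        Err(format!("line is {} characters, exceeding the {} limit", n, max))
    } else {
        Ok(())
    }
}
```

Running this check before (or after) the parse keeps compile times negligible while still rejecting egregiously out-of-spec input.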
Is this a known limitation of the implementation of limited repeat?
Sample grammar:
I tried this in the snippet generator but I think it can't handle it because of this issue.