Closed sulir closed 3 years ago
@sulir if you remember this ticket, what did you mean by "this style should allow less strict ambiguity detection"? Thanks
I cannot remember exactly, but probably it is related to cases such as this one:
rule = "a": 0000 |
"b": 00 subrule(2);
The recognition would be performed in the order defined in the file, so if some variant matches, the generated return
statement is executed and the remaining variants are not tested.
But the approach may have some shortcomings, and maybe I am confusing this with some other idea. It was a long time ago.
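To make the ordering concrete, here is a rough sketch of what return-style recognition could look like for the rule above. It is written in Python purely for illustration and is not the code the generator emits; the contents of `subrule` and the `DecodeError` name are assumptions, since neither is defined in this thread.

```python
class DecodeError(Exception):
    """Raised at the end of a rule when no variant matched."""


def subrule(bits: str) -> str:
    # Hypothetical 2-bit subrule; its variants are made up for this sketch.
    names = {"00": "x", "01": "y"}
    if bits in names:
        return names[bits]
    raise DecodeError(f"subrule: no variant matches {bits!r}")


def rule(bits: str) -> str:
    # Variant "a": 0000 is tested first, in file order.
    if bits == "0000":
        return "a"
    # Variant "b": 00 subrule(2) is reached only if "a" did not match.
    # It is the last variant, so a failure inside subrule simply propagates.
    if len(bits) == 4 and bits[:2] == "00":
        return "b " + subrule(bits[2:])
    # No default branch: the rule ends by throwing.
    raise DecodeError(f"rule: no variant matches {bits!r}")
```

With this ordering the input `0000` is reported as `a`, even though `b` with the hypothetical subrule variant `00` would also accept it. A strict ambiguity check would reject such a grammar instead.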
Hm, this actually reminds me of the problem I explained in #34. E.g., the following example:
instruction = "JMP": line(5) ignore8(8) 000 ignore16(16) |
"JPR": line(5) ignore8(8) 100 ignore16(16) |
"DATA": data(32);
will fail with "ambiguity detected", even though I do not want it to fail. My expectation (and the feature proposal in #34) is that it would try the instructions first, and only if none of them match would the last "branch", "DATA", be matched.
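A small sketch of why the example is reported as ambiguous, and what ordered matching would do instead. This is Python for illustration only and assumes a simple mask/value model of each variant; it is not how the tool actually implements its check. The field layout follows the example (5 + 8 bits, then the 3 fixed bits, then 16 bits), and the function names are hypothetical.

```python
from itertools import combinations

# The 3 fixed bits sit above the trailing 16 ignored bits of the 32-bit word.
FIELD_SHIFT = 16

# (name, mask of fixed bits, required value of those bits)
VARIANTS = [
    ("JMP", 0b111 << FIELD_SHIFT, 0b000 << FIELD_SHIFT),
    ("JPR", 0b111 << FIELD_SHIFT, 0b100 << FIELD_SHIFT),
    ("DATA", 0, 0),  # data(32) fixes no bits, so it accepts any word
]


def overlaps(a, b):
    """Two variants overlap if they agree on every bit that both of them fix."""
    _, mask_a, value_a = a
    _, mask_b, value_b = b
    common = mask_a & mask_b
    return (value_a ^ value_b) & common == 0


def ambiguous_pairs(variants):
    """Strict check: list every pair of variants that some word could match."""
    return [(a[0], b[0]) for a, b in combinations(variants, 2) if overlaps(a, b)]


def decode_in_order(word, variants):
    """Ordered check: the first matching variant wins, later ones are not tried."""
    for name, mask, value in variants:
        if word & mask == value:
            return name
    raise ValueError("no variant matches")
```

Under this model, `ambiguous_pairs(VARIANTS)` reports both JMP/DATA and JPR/DATA, while `decode_in_order` returns "DATA" only for words whose fixed field is neither 000 nor 100.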
One solution, as explained in #34, is to support multiple root rules, which would allow a kind of mix of the current style with a "return-style" for the root rules:
root instruction, data;
instruction = "JMP": line(5) ignore8(8) 000 ignore16(16) |
"JPR": line(5) ignore8(8) 100 ignore16(16);
data = "DATA": data(32);
but on the other hand it is a grammar change. I actually like that the current style detects the ambiguity, because I think in most cases it is the programmer's mistake (I have run into many such issues and it was always my error). So I think it would be a shame if we lost this capability by extending the return-style to all rules.
I guess the proposal in #34 solves this best as far as I can see.
Therefore I propose to close this ticket...
OK.
There is another possible style of generated decoder code, which uses the `return` statement after a variant is successfully recognized and throws an exception at the end of each rule. The `default` branches are not used at all. This style should allow less strict ambiguity detection and the performance could be improved. However, it would bring some new issues which must be considered, e.g. the instruction length recognition would be more difficult.