Open GlenDC opened 7 years ago
A possible fix for this issue(?) would be to change the definition of a block-text to the following:
block-text ::= NL __ '|' (unquoted-text | placeable)+;
Looking back at this issue, it might be that recursive definitions are normal for an EBNF grammar. For example an arglist
and placeable-list
are also defined recursively. While parsers will in most cases parse such cases iterative, as this is more idiomatic for most commonly used programming languages these days.
Is there a good reason this is defined recursively? As there are enough places where definitions are iterative, using the +
operator. I don't see why the situations mentioned in this issue aren't defined in a similar fashion.
@stasm is this issue in fact valid, or should it be marked as a question and it can be closed with some explanation? If you want I can re-open it in https://github.com/projectfluent/syntax, as that is now the new repository of this project, no?
Currently I'm working on a C# implementation of L20n (FTL), which I do mostly based on the specified EBNF grammar.
TLDR: Could we rewrite the
block-text
/unquoted-pattern
section of this EBNF grammar, to make ablock-text
notn*2
-recursive, wheren
is the amount of lines defined for a (group?)block-text
.A
block-text
is defined asNL __ '|' unquoted-pattern
and aunquoted-pattern
is defined as:unquoted-text | placeable | block-text)+
. This is a bit problematic as it is completely recursive in its nature.Not sure if this is a problem in the way I parse it in C#, or in the way the EBNF definition is setup. But the way I would expect it, is that a
block-text
such as:would be simply parsed as 1
block-text
, consisting out of 3unquoted-patterns
{a, b, c}
(which aren'tblock-texts
on their own. In this case those 3unquoted-patterns
would be 3unquoted-texts
{a, b, c}
.The problem is that because of the current definition, this would instead result in 1
block-text
a
that has 1 childblock-text
b
, which has 1 childblock-text
of its ownc
. This is very recursive, making the parsing go very deep in its call-stacks, and I can imagine that a block text could be big enough to actually get a callstack overflow on runtime.My C# implementation is written by hand, so in theory I should be able to fix this by refactoring the code, to be not exactly like the EBNF grammar specified, while still getting the same result. However I was thinking to write parsers automatically based on the EBNF grammar for some other languages, and for those there wouldn't be such a solution possible.