modernserf closed this 5 years ago
Yeah, this is something I've noticed too. The main thing holding me back was readability: if you're used to regular expressions, identifier % "," won't mean anything to you. I couldn't come up with a syntax I was happy with.
But now that I think about it, we could just use the word itself. Something like
FnDefinition = "function" identifier "(" identifier .separated-by "," ")" Block
It's not much shorter than identifier ? ("," identifier)*
to write, but it avoids the repetition while increasing readability even more than the %
operator would.
Just wanted to toss out another possible syntax. When I thought about writing a parser generator, I considered providing regex-style limited repeat (i.e. identifier{n} means repeat exactly n times, identifier{n,m} means repeat n to m times, and identifier{n,} means repeat n or more times). Assuming you support that (or even if you don't), there is a straightforward modification for repeat separated by: identifier{","} would mean repeat separated by commas. It could be mixed with numeric limits, so that identifier{",", 1} would be at least one identifier separated by commas.
Combining this with regex-style repeat does make a lot of sense. I think for consistency you'd want identifier{",", 1} to be exactly one identifier and identifier{",", 1, } to be 1+.
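To make the bounds discussed above concrete, here is a minimal Python sketch of how a bounded, separated repetition could be matched over a token list. It is not Owl's implementation; `match_separated`, `is_item`, `lo`, and `hi` are hypothetical names, and the sketch assumes that omitting the bounds means "zero or more".

```python
from typing import Callable, List, Optional, Sequence, Tuple


def match_separated(
    tokens: Sequence[str],
    is_item: Callable[[str], bool],
    sep: str,
    lo: int = 0,
    hi: Optional[int] = None,
) -> Optional[Tuple[List[str], int]]:
    """Greedily match `item (sep item)*` at the start of `tokens`.

    Returns (items, tokens_consumed), or None when fewer than `lo` items
    match. hi=None means there is no upper bound.
    """
    items: List[str] = []
    pos = 0
    while hi is None or len(items) < hi:
        if items:
            # Every item after the first must be preceded by the separator.
            if pos >= len(tokens) or tokens[pos] != sep:
                break
            item_pos = pos + 1
        else:
            item_pos = pos
        if item_pos >= len(tokens) or not is_item(tokens[item_pos]):
            break  # a dangling separator is left unconsumed
        items.append(tokens[item_pos])
        pos = item_pos + 1
    if len(items) < lo:
        return None
    return items, pos


args = ["a", ",", "b", ",", "c", ")"]
# identifier{","}        -> lo=0, hi=None (assumed default)
print(match_separated(args, str.isidentifier, ","))        # (['a', 'b', 'c'], 5)
# identifier{",", 1}     -> exactly one
print(match_separated(args, str.isidentifier, ",", 1, 1))  # (['a'], 1)
# identifier{",", 1, }   -> one or more; fails on an empty argument list
print(match_separated([")"], str.isidentifier, ",", 1))    # None
```

Reading `{",", 1}` as "exactly one" keeps the numeric part consistent with the plain `{n}` form, which is the point made in the comment above.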
You're right. My plan was to use the syntax you suggest, but I messed it up when I wrote my comment.
After playing with it for a while, I think I'm going to do this but with a slightly different syntax: identifier{3} for exactly three, identifier{3-5} for three to five, and identifier{3+} for three or more. The comma example then becomes identifier{",", 1+}, which I find much easier to read.
Never mind; using - as a keyword disables dashes in identifiers. The identifier{3, 5} syntax will have to do for ranges.
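One way to picture the conflict is with a toy lexer. The sketch below is purely illustrative and is not how Owl tokenizes grammars; `tokenize` and `dash_is_keyword` are hypothetical names. It only shows the consequence described above: once - is its own token, a dashed identifier can no longer be read as a single name.

```python
import re
from typing import List


def tokenize(text: str, dash_is_keyword: bool) -> List[str]:
    """Toy lexer: identifiers may contain dashes unless '-' is reserved."""
    if dash_is_keyword:
        ident = r"[A-Za-z_][A-Za-z0-9_]*"   # '-' can no longer be part of a name
    else:
        ident = r"[A-Za-z_][A-Za-z0-9_-]*"  # dashes allowed inside names
    pattern = re.compile(rf"{ident}|\d+|[-+,{{}}]")
    return pattern.findall(text)


print(tokenize("rule-name{3, 5}", dash_is_keyword=False))
# ['rule-name', '{', '3', ',', '5', '}']
print(tokenize("rule-name{3-5}", dash_is_keyword=True))
# ['rule', '-', 'name', '{', '3', '-', '5', '}']
```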
I just tagged a new version (owl.v4) with support for this. The original example can now be written like:
FnDefinition = "function" identifier "(" identifier{","} ")" Block
And thanks @WalkerCodeRanger for the suggestion; I'm much happier with this syntax than anything I had come up with before.
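Since the new form is sugar over the operators already used in this thread, it can be helpful to see what it would expand to. The following Python sketch rewrites a separated repetition using only sequencing, grouping, ? and *. The function name `desugar` and its parameters are hypothetical, and it assumes that the form without bounds permits zero items; this is an illustration of the idea, not Owl's actual expansion.

```python
from typing import Optional


def desugar(item: str, sep: str, lo: int = 0, hi: Optional[int] = None) -> str:
    """Rewrite item{sep, lo, hi} using only sequencing, grouping, ? and *."""
    if lo < 0 or (hi is not None and hi < max(lo, 1)):
        raise ValueError("bounds must satisfy 0 <= lo <= hi and hi >= 1")
    required = max(lo, 1)
    # The mandatory items, joined by the separator.
    parts = [item] + [f"{sep} {item}"] * (required - 1)
    if hi is None:
        parts.append(f"({sep} {item})*")
    else:
        # Each optional extra item nests inside the previous one.
        tail = ""
        for _ in range(hi - required):
            inner = f" {tail}" if tail else ""
            tail = f"({sep} {item}{inner})?"
        if tail:
            parts.append(tail)
    body = " ".join(parts)
    # With lo == 0 the whole list may be absent.
    return f"({body})?" if lo == 0 else body


print(desugar("identifier", '","'))        # (identifier ("," identifier)*)?
print(desugar("identifier", '","', 1))     # identifier ("," identifier)*
print(desugar("identifier", '","', 2, 3))  # identifier "," identifier ("," identifier)?
```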
I find myself writing some variation of
RuleName ? ("," RuleName)*
repeatedly, even in simple grammars. This is not a tremendous hardship, but it would be more elegant to have a "separated by" operator, similar to those that come in a lot of parser combinator libraries. For example:
identifier ? ("," identifier)*
would instead be:
identifier % ","
Again, I recognize that this is merely syntactic sugar, but the + and ? operators are also merely syntactic sugar, and they reduce repetition and increase readability.