erikrose / parsimonious

The fastest pure-Python PEG parser I can muster
MIT License

Range and some-of operators #139

Closed na-sa-do closed 2 years ago

na-sa-do commented 6 years ago

Range operators

Adapted from POSIX regexes:

foo = bar{2}
# expands into
foo = bar bar

foo = bar{2,4}
# expands into
foo = bar bar bar? bar?

# this one isn't part of POSIX, but it's a fairly obvious extension
foo = bar{,3}
# expands into
foo = bar? bar? bar?

Further, suffixing the upper bound with = adds a negative lookahead, so no additional bar may follow the match:

foo = bar{,3=}
# expands into
foo = bar? bar? bar? !bar

This might be worth making the default; I'm not sure. The = suffix shouldn't be usable without the preceding comma, since that sends conflicting messages about whether the number is a lower or an upper bound.
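To make the desugaring above concrete, here is a minimal sketch of it as a string rewrite. The function name `expand_range` and its parameters are hypothetical and only illustrate the proposal; this is not parsimonious code, and it assumes the `{lo,hi}` part has already been parsed into integers.

```python
def expand_range(rule, lo, hi, exclusive=False):
    """Desugar rule{lo,hi} (or rule{lo,hi=}) into plain PEG text.

    Hypothetical helper: `rule` is a rule name, `lo` and `hi` are the
    already-parsed bounds, `exclusive` stands for the proposed `=` suffix.
    """
    if not 0 <= lo <= hi:
        raise ValueError("need 0 <= lo <= hi")
    # `lo` mandatory copies, then (hi - lo) optional ones.
    terms = [rule] * lo + [rule + "?"] * (hi - lo)
    if exclusive:
        # Negative lookahead: fail if yet another `rule` would match here.
        terms.append("!" + rule)
    return " ".join(terms)


print(expand_range("bar", 2, 2))                  # bar{2}
print(expand_range("bar", 2, 4))                  # bar{2,4}
print(expand_range("bar", 0, 3))                  # bar{,3}
print(expand_range("bar", 0, 3, exclusive=True))  # bar{,3=}
```

The four calls reproduce the four expansions written out above, with `bar{2}` treated as the case where both bounds are equal.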

Some-of operators

I have no idea what this syntax should look like. This is just what first came to mind.

foo = ~(bar baz bluh){2}
# expands into
foo = (bar baz) / (bar bluh) / (baz bluh)

foo = ~(bar baz bluh){2,3}
# expands into
foo = (bar baz) / (bar bluh) / (baz bluh) / (bar baz bluh)
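
The expansion above amounts to listing every order-preserving subset of the parts whose size falls within the bounds. A rough sketch using `itertools.combinations` follows; `expand_some_of` and its `longest_first` flag are made-up names for illustration, not anything parsimonious provides.

```python
from itertools import combinations


def expand_some_of(parts, lo, hi, longest_first=False):
    """Desugar ~(parts){lo,hi} into an ordered choice, keeping part order.

    Hypothetical helper. A lower bound of 0 is handled by wrapping the
    whole choice in `(...)?`, which is one possible reading of the proposal.
    """
    if not (0 <= lo <= hi <= len(parts)) or hi < 1:
        raise ValueError("need 0 <= lo <= hi <= len(parts) and hi >= 1")
    sizes = range(max(lo, 1), hi + 1)
    if longest_first:
        sizes = reversed(sizes)
    alternatives = []
    for size in sizes:
        # combinations() keeps the original left-to-right order of parts.
        for combo in combinations(parts, size):
            text = " ".join(combo)
            alternatives.append("(" + text + ")" if size > 1 else text)
    choice = " / ".join(alternatives)
    return "(" + choice + ")?" if lo == 0 else choice


parts = ["bar", "baz", "bluh"]
print(expand_some_of(parts, 2, 2))
print(expand_some_of(parts, 2, 3))
print(expand_some_of(parts, 2, 3, longest_first=True))
```

One thing worth keeping in mind: PEG choice is ordered, so with the shorter alternatives first, `(bar baz)` succeeds before `(bar baz bluh)` is ever tried. The `longest_first` flag is there only to show the alternative ordering; which one the operator should produce is an open question.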

There isn't much of a point to a lower bound of 0, as that would be equivalent to chaining ?, but it might as well be allowed anyway for simplicity. Note that reordering the parts isn't permitted here; it would be preferable to allow that, but the resulting implicit alternatives would be confusing:

foo = ~(bar baz bluh){2}
# could expand into
foo = (bar baz) / (bar bluh) / (baz bar) / (baz bluh) / (bluh bar) / (bluh baz)
# but what does that parse things as? we just don't know
# (it's still well-defined, of course, just weird)
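
For comparison, the reordering variant above is what you get from permutations rather than combinations. This is only a sketch with a hypothetical helper name, `expand_any_order`, and it handles a single fixed size.

```python
from itertools import permutations


def expand_any_order(parts, size):
    """List every ordered selection of `size` distinct parts as a PEG choice.

    Hypothetical helper illustrating the order-insensitive reading.
    """
    return " / ".join(
        "(" + " ".join(selection) + ")"
        for selection in permutations(parts, size)
    )


print(expand_any_order(["bar", "baz", "bluh"], 2))
```

The number of alternatives is n!/(n-k)! for k parts chosen out of n, so it grows much faster than the order-preserving version: six alternatives here instead of three.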

If we introduce syntax to do this, then a lower bound of 0 does become meaningful. And, again, = is possible:


foo = ~(bar baz bluh){,2=}
# expands into
foo = ((bar baz) / (bar bluh) / (baz bluh)) (!bar !baz !bluh)

na-sa-do commented 6 years ago

Lojban's machine grammar uses & for my ~(...){}. This doesn't allow for the {} minimum-maximum addition, but it's a lead to follow, if nothing else.