invisibleXML / ixml

Invisible XML
GNU General Public License v3.0

Combining grammars (for v.Next) #75

Open ndw opened 2 years ago

ndw commented 2 years ago

I'd like to be able to import one grammar into another. Suppose I have a grammar for parsing BCP 47 language codes, ISO 8601 dates, or what-have-you. It would be nice to be able to combine such grammars by importing or including them. Consider:

language-list: Language-Tag++(',', ' '*) .

#include 'bcp47.ixml' .

That gives me a new grammar that accepts a list of BCP 47 language tags.

We would have to work out the semantics of "duplicated" rule names. I'm inclined towards replace semantics.

I'm not proposing that this should work blindly for combining any two grammars, only that if an author knows the names of the nonterminals in one grammar, they should be able to construct another so that they can include the former by reference.

sydb commented 2 years ago

+1

Not sure exactly what @ndw means by “replace semantics” (I suppose I should just ask him), but my instinct is that a rule defining a non-terminal in the driver grammar should supersede a rule defining the same non-terminal in an included grammar. If the same non-terminal is defined by two (or more) included grammars at the same level, that should be an error (a violation of Conformance of grammars, which, I have to admit, I could not find elsewhere in the spec).
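To make the superseding idea concrete, here is a minimal Python sketch. It is not part of any ixml implementation or of the proposal itself. It assumes a grammar can be modelled as a dict from non-terminal name to an opaque right-hand-side string; the names `combine_replace` and `GrammarConflict` are made up for illustration, and the right-hand sides in the usage example are placeholders, not the real BCP 47 grammar. It also assumes that a clash between two included grammars is an error even when the driver overrides the rule, which is only one possible reading.

```python
class GrammarConflict(Exception):
    """Raised when two included grammars define the same non-terminal."""


def combine_replace(driver, includes):
    """Combine grammars so that driver rules supersede included rules.

    Each grammar is a dict mapping a non-terminal name to its
    right-hand side, kept here as an opaque string (an assumption
    made for illustration only).
    """
    included = {}
    for grammar in includes:
        for name, rhs in grammar.items():
            # Two included grammars at the same level define the same
            # non-terminal: treat that as an error.
            if name in included:
                raise GrammarConflict(name)
            included[name] = rhs

    # Start from the driver so its first rule stays first, then add
    # only those included rules that the driver does not define.
    combined = dict(driver)
    for name, rhs in included.items():
        combined.setdefault(name, rhs)
    return combined


# Hypothetical usage; the right-hand sides are placeholders.
driver = {
    "language-list": "Language-Tag++(',', ' '*)",
    "subtag": "driver-version",
}
bcp47 = {
    "Language-Tag": "primary, ('-', subtag)*",
    "subtag": "included-version",
}
combined = combine_replace(driver, [bcp47])
print(combined["subtag"])  # driver-version
```

Starting from the driver's rules keeps the driver's first rule first, which matters if the first rule is taken as the start symbol of the combined grammar.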

I also think the form of the included grammar (ixml vs XML) probably should have to match that of the including grammar, just to make life easier on implementers.

cmsmcq commented 2 years ago

Sometimes I believe I will want replace semantics (which Syd calls 'superseding'); at other times I am confident that I will want production rules in the including grammar to be merged with those of the included grammar. Given the standard formal treatment of a grammar as having a set of production rules, the merging semantics would be very natural.
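For contrast with replace semantics, merging might look like the following Python sketch. It assumes each grammar maps a non-terminal name to a list of alternatives, so that a same-named rule in two grammars simply pools its alternatives; `combine_merge` and the date rules in the usage example are hypothetical and not drawn from any existing grammar or implementation.

```python
def combine_merge(driver, includes):
    """Merge grammars by pooling the alternatives of same-named rules.

    Each grammar maps a non-terminal name to a list of alternatives,
    mirroring the view of a grammar as a set of production rules.
    """
    # Copy the driver's lists so the input grammars are not modified.
    combined = {name: list(alts) for name, alts in driver.items()}
    for grammar in includes:
        for name, alts in grammar.items():
            target = combined.setdefault(name, [])
            for alt in alts:
                # A set of productions holds no duplicates.
                if alt not in target:
                    target.append(alt)
    return combined


# Hypothetical usage: both grammars contribute alternatives for "date".
driver = {"date": ["year"]}
included = {"date": ["year, '-', month"], "year": ["digit+"]}
merged = combine_merge(driver, [included])
print(merged["date"])  # ['year', "year, '-', month"]
```

Under this reading no rule is ever lost, so an author who wants to override an included rule would need some other mechanism.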

At this point I am inclined to think that the territory here may be mapped out best by the rules for combination of RELAX NG schemas (though there are constructs there I avoid using because I do not understand them at all; I might prefer not to construct analogs to those).