djc / rnc2rng

RELAX NG Compact to regular syntax conversion library
MIT License
16 stars 13 forks source link

Parsing of documented groups fails #16

Closed puetzk closed 5 months ago

puetzk commented 5 years ago

e.g.

element foo {
    ## documentation
    (
        attribute a {xsd:string},
        attribute b {xsd:string}
    )?
}

Will fail to parse, giving the error parse error in test.rnc [3:2]

This is, as far as I can tell, legal syntax: https://www.oasis-open.org/committees/relax-ng/compact-20021121.html §5.2 says

## comments can only be used immediately before before a pattern, nameClass, grammarContent or includeContent. Multiple ## comments are allowed. Multiple adjacent ## comments without any intervening blank lines are merged into a single documentation element. Any ## comments must precede any annotation in square brackets.

And parenthesized patterns are still patterns per the syntax in §2

pattern ::= ... "(" pattern ")

I think (not completely confident that I've mapped how you're naming the productions vs what that spec uses) that the issue is just at https://github.com/djc/rnc2rng/blob/2b2c3c7c7f6a59055e68aea53769028609b469e9/rnc2rng/parser.py#L394-L401

And specifically that the fist rule should have been for primary, not annoated-primary, e.g.

-@pg.production('annotated-primary : LPAREN pattern RPAREN')
+@pg.production('primary : LPAREN pattern RPAREN')

Thus still allowing the next production to attach annotations in front of such a group.

That change does make the grammar where I first encountered the issue parse and agree with how trang translates this constract to rng (i.e. something like the following, though your translation is more eager about using a child <name ns="..."> element, whereas trang prefers to use a name attribute which is transformed to this form. I don't think that causes any real difference in semantics though.

    <element name="foo">
      <optional>
        <group>
          <a:documentation>documentation</a:documentation>
          <attribute name="a">
            <data type="string"/>
          </attribute>
          <attribute name="b">
            <data type="string"/>
          </attribute>
        </group>
      </optional>
    </element>

rnc2rng also seems to parse, but not emit, documentation elements for references, e.g.

ref = element bar { empty }

start = element foo {
    ## documentation
    ref
}

emits

<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <define name="ref">
    <element>
      <name ns="">bar</name>
      <empty/>
    </element>
  </define>
  <start>
    <element>
      <name ns="">foo</name>
      <ref name="ref"/>
    </element>
  </start>
</grammar>

instead of trang's

<grammar xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0" xmlns="http://relaxng.org/ns/structure/1.0">
  <define name="ref">
    <element name="bar">
      <empty/>
    </element>
  </define>
  <start>
    <element name="foo">
      <ref name="ref">
        <a:documentation>documentation</a:documentation>
      </ref>
    </element>
  </start>
</grammar>

i.e. the <a:documentation> for the ref is missing. I haven't figured out where this is going wrong, so I don't know if it's truly related or should be a separate issue.

djc commented 5 years ago

Hey, thanks for reporting! Since you're already pretty deep into the investigation, would you mind contributing a PR to fix these issues? Or if it's hard to find a fix for the second issue, a PR contributing a test case would also be great to have.

djc commented 4 years ago

@puetzk oh hey, I'm just now noticing that you managed to come up with what looks like a fix for this issue? Are you still interested in submitting this as a PR, or are there issues with your approach?

puetzk commented 4 years ago

As I recall it was working (and I then ended up fixing a couple other namespace things), but it does look like I never actually opened the PR. Oops. I'll take a look at the documents I originally had problems with to refresh my memory of whether there was anything else still wrong and get one opened.

puetzk commented 4 years ago

I know that at least the commits in https://github.com/djc/rnc2rng/compare/master...puetzk:annotations were working and added tests to exercise the cases they were meant to change. I just don't recall if they were actually sufficient to make my whole grammar work as intended or if there was something else that still needed attention.

djc commented 4 years ago

Okay, would be great if you can submit them as a PR!