CSL schema fails: no non-element tokens allowed at top level

jayvdb commented 8 years ago

CSL schema here: https://github.com/citation-style-language/schema/blob/master/csl.rnc

$ /usr/bin/rnc2rng csl.rnc > csl.rng
Traceback (most recent call last):
  File "/usr/bin/rnc2rng", line 9, in <module>
    load_entry_point('rnc2rng==1.7', 'console_scripts', 'rnc2rng')()
  File "build/bdist.linux-x86_64/egg/rnc2rng/__main__.py", line 9, in main
  File "build/bdist.linux-x86_64/egg/rnc2rng/rnctree.py", line 393, in tree
  File "build/bdist.linux-x86_64/egg/rnc2rng/rnctree.py", line 388, in make_nodetree
  File "build/bdist.linux-x86_64/egg/rnc2rng/rnctree.py", line 372, in scan_NS
rnc2rng.rnctree.ParseError: no non-element tokens allowed at top level

A little experimenting indicates it doesn't support comments, annotations, prefixed names such as dc:foo and sch:ns, include, and almost everything else.
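The kind of probing described above could be sketched roughly as follows. This is only an illustration: the snippet names and contents are hypothetical reduced cases (not taken from csl.rnc), `probe` and `run_cli` are made-up helper names, and the only thing assumed about rnc2rng is the command-line form shown in the report (`rnc2rng file.rnc`, RNG on stdout).

```python
import subprocess
import tempfile
from pathlib import Path

# Hypothetical reduced cases; each is meant to be a small, valid
# RELAX NG Compact schema exercising one construct.
SNIPPETS = {
    "baseline": 'start = element root { empty }\n',
    "comment": '# a comment\nstart = element root { empty }\n',
    "annotation": (
        'namespace dc = "http://purl.org/dc/elements/1.1/"\n'
        'dc:title [ "Example" ]\n'
        'start = element root { empty }\n'
    ),
}


def run_cli(path):
    """Run the rnc2rng command on one file, as in the report above."""
    done = subprocess.run(
        ["rnc2rng", str(path)], capture_output=True, text=True, check=True
    )
    return done.stdout


def probe(snippets, convert=run_cli):
    """Convert each snippet; map its name to None (ok) or an error summary."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for name, source in snippets.items():
            path = Path(tmp) / f"{name}.rnc"
            path.write_text(source, encoding="utf-8")
            try:
                convert(path)
                results[name] = None
            except Exception as exc:  # any failure counts as "unsupported"
                results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    for name, error in probe(SNIPPETS).items():
        print(f"{name}: {error or 'ok'}")
```

The converter is passed in as a parameter so the same loop can be pointed at a different rnc2rng version or a stub. An `include` case would need a second file next to the snippet, so it is left out here.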

djc commented 8 years ago

Hey, thanks for giving rnc2rng a try!

Yes, the parser is currently quite limited. However, it's rapidly improving. In particular, I now have a new branch that implements a "real" parser based on rply, instead of the ad-hoc thing that's currently there. This should go a long way towards making it more robust and easier to implement more of the standard. However, I wanted to make sure I have some regression tests in place to check that the parts of the standard that the code seems to try to support actually work. That work is now almost done.

If you want to go through the effort of creating somewhat more reduced test cases, I'd be happy to start tackling those as soon as the new parser lands.

djc commented 8 years ago

This took some time (and more than 100 commits), but as of b040414b rnc2rng now supports enough of the RELAX NG Compact spec to parse and serialize your csl.rnc (with includes). I don't have a great way to validate that the output is actually correct yet, but it should be reasonably close. Feel free to file follow-up issues about anything that's incorrect in the output.
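Short of real validation, a rough sanity check on the generated file is possible with the standard library alone. The sketch below is an assumption about what such a check might look like, not something rnc2rng provides: `sanity_check` is a hypothetical helper that confirms the output is well-formed XML, that its root element is in the RELAX NG namespace, and that every `ref` names a `define` in the same file. Definitions pulled in through RNG `include` elements or nested grammars are not followed, so refs to those would be flagged incorrectly.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"


def sanity_check(rng_text):
    """Return a list of problems found; an empty list means the checks passed."""
    try:
        root = ET.fromstring(rng_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if not root.tag.startswith("{" + RNG_NS + "}"):
        problems.append(
            f"root element {root.tag!r} is not in the RELAX NG namespace"
        )

    # Collect every named definition, then look for refs that miss them.
    defined = {d.get("name") for d in root.iter(f"{{{RNG_NS}}}define")}
    for ref in root.iter(f"{{{RNG_NS}}}ref"):
        if ref.get("name") not in defined:
            problems.append(f"ref to undefined pattern {ref.get('name')!r}")
    return problems
```

Passing the contents of csl.rng to `sanity_check` would catch only structural slips; it says nothing about whether the RNG means the same thing as the original RNC.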
