Closed JessedeDoes closed 7 months ago
The only processing instruction that we implemented in FoLiA is the xml-stylesheet one. All others are ignored, and apparently not correctly handled in the validation step.
It might be a good idea to incorporate a processing instruction node in FoLiA. Seems rather straightforward to implement.
Thanks for the report. Looks like a bug indeed; I'll have to pinpoint it. Processing instructions should indeed simply be ignored by the validator.
I did some reading in the lxml/etree documentation, and it seems that the Python parsers DISCARD all Processing Instructions in the input file:
Note that XMLParser skips over processing instructions in the input instead of creating comment objects for them. An ElementTree will only contain processing instruction nodes if they have been inserted into to the tree using one of the Element methods.
The same holds for XML comments. Quite a bummer.
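To make the quoted behaviour concrete, here is a minimal sketch using only the standard-library xml.etree.ElementTree module (the quoted note appears to come from its documentation; lxml may behave differently and is not tested here). The sample document and the function names parse_default and parse_keeping are made up for illustration. It assumes Python 3.8 or later, where TreeBuilder accepts insert_pis and insert_comments.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input; the PI target is taken from the report in this thread.
XML = "<FoLiA><?n_elan_annotations 2?><!-- a note --><text/></FoLiA>"


def parse_default(xml):
    """Parse with the default parser, which drops PIs and comments."""
    return ET.fromstring(xml)


def parse_keeping(xml):
    """Parse with a TreeBuilder that is told to keep PIs and comments."""
    builder = ET.TreeBuilder(insert_pis=True, insert_comments=True)
    parser = ET.XMLParser(target=builder)
    return ET.fromstring(xml, parser=parser)


dropped = list(parse_default(XML))
kept = list(parse_keeping(XML))

print(len(dropped))  # only the <text/> element survives
print(len(kept))     # PI, comment and <text/> are all present
print(kept[0].text)  # the PI is stored as "target data"
```

Note that only processing instructions inside the root element end up in the tree this way; whether that is sufficient for FoLiA documents is not something this sketch settles.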
Implementing PIs in libfolia is a no-brainer.
I implemented support for PIs in libfolia, just to be complete. It may be of only limited use, but OK.
I implemented a fix (simply ignoring processing instructions) in foliapy; it is still pending release.
(fixed in FoLiA-tools v2.5.5)
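For readers wondering what "simply ignoring processing instructions" can look like in a tree walk, the following is a rough sketch and not the actual foliapy fix. It uses the standard-library xml.etree.ElementTree module and relies on the fact that PI and comment nodes carry a callable as their tag rather than a string. The helper name element_children and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def element_children(node):
    """Yield only real element children, skipping PI and comment nodes."""
    for child in node:
        # Real elements have a string tag; PI and comment nodes do not.
        if isinstance(child.tag, str):
            yield child


# Build a small tree that actually contains a PI node (hypothetical sample).
root = ET.Element("FoLiA")
root.append(ET.ProcessingInstruction("n_elan_annotations", "2"))
root.append(ET.Comment("a note"))
ET.SubElement(root, "text")

names = [child.tag for child in element_children(root)]
print(names)
```

A validator that iterates with such a filter never sees the PI node, so it has nothing to reject.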
FoLiA-tools v2.5.4, using FoLiA v2.5.1 with library FoLiApy v2.5.8
gcnd.test.folia.xml.txt
This file validates when I omit the processing instructions (<?n_elan_annotations 2?> etc.), but with the processing instructions:
(Of course, I can use a comment or something else for this type of information, so it is not a showstopper.)