LanguageMachines / libfolia

FoLiA library for C++
https://proycon.github.io/folia
GNU General Public License v3.0

spurious 'text' element is not detected #43

Closed kosloot closed 4 years ago

kosloot commented 4 years ago

Given this simple file:

<?xml version="1.0" encoding="UTF-8"?>
<FoLiA xmlns:xlink="http://www.w3.org/1999/xlink" xmlns="http://ilk.uvt.nl/folia" xml:id="build1" generator="libfolia-v2.4" version="2.2.1">
  <metadata type="native">
    <annotations>
      <paragraph-annotation/>
      <text-annotation set="https://raw.githubusercontent.com/proycon/folia/master/setdefinitions/text.foliaset.ttl"/>
    </annotations>
  </metadata>
  <text xml:id="build1.text">
    <p xml:id="p3">
      <t>paragraaf 3</t>
    </p>
    </text>wat nu?
</FoLiA>

folialint doesn't detect the spurious 'wat nu?' after the final </text> and states that the file is valid FoLiA.

foliavalidator DOES detect the problem:

Error on line 2: Element FoLiA has extra content: text
VALIDATION ERROR against RelaxNG schema (stage 1/3), in /home/sloot/Downloads/build.xml
Element FoLiA has extra content: text, line 2

When the 'wat nu?' is moved to just after the </p> (so that it sits inside the <text> node), both programs complain, but the message from folialint is quite cryptic:

XML error: Unable to append object of type _XmlText to a <text> (id=build1.text)

It would be nice to have a clearer message from folialint AND to detect it on the 'top' node too.
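
To make the two cases concrete, here is a minimal Python sketch of one way such stray character data could be found. It is not how folialint or foliavalidator actually work. The names `find_stray_text`, `TEXT_ALLOWED` and `TEMPLATE` are hypothetical, and the assumption that only `<t>` may hold character data is a deliberate simplification of the real FoLiA rules.

```python
import xml.etree.ElementTree as ET

# Assumption for illustration only: <t> is the sole element allowed to hold
# character data. The real FoLiA specification permits text in more places.
TEXT_ALLOWED = {"t"}

# Trimmed version of the file from this issue, with two slots for stray text.
TEMPLATE = """<FoLiA xmlns="http://ilk.uvt.nl/folia" xml:id="build1" version="2.2.1">
  <text xml:id="build1.text">
    <p xml:id="p3">
      <t>paragraaf 3</t>
    </p>{inner}
  </text>{outer}
</FoLiA>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def find_stray_text(xml_string):
    """Return (parent element name, stray text) pairs in document order."""
    root = ET.fromstring(xml_string)
    problems = []
    stack = [root]
    while stack:
        parent = stack.pop()
        name = local_name(parent.tag)
        if name in TEXT_ALLOWED:
            continue  # character data is fine here; do not descend further
        # Text directly inside `parent` is its own .text plus the .tail of
        # each child (the text that follows that child's closing tag).
        for chunk in [parent.text] + [child.tail for child in parent]:
            if chunk and chunk.strip():
                problems.append((name, chunk.strip()))
        stack.extend(reversed(list(parent)))
    return problems


# Case 1: 'wat nu?' after the final </text>, directly under the top node.
print(find_stray_text(TEMPLATE.format(inner="", outer="wat nu?")))
# Case 2: 'wat nu?' after </p>, inside the <text> node.
print(find_stray_text(TEMPLATE.format(inner="wat nu?", outer="")))
```

The point of the sketch is that the top node needs no special handling: text after `</text>` is simply the tail of the `<text>` child of `<FoLiA>`, so one check covers both cases.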

kosloot commented 4 years ago

The second case mentioned above is better handled by folialint now:

XML error: element <text> has extra text 'wat nu?', NOT allowed there.

kosloot commented 4 years ago

Fixed now, with an even cleaner error message.