LanguageMachines / libfolia

FoLiA library for C++
https://proycon.github.io/folia
GNU General Public License v3.0

folialint rejects valid input #52

Closed kosloot closed 1 year ago

kosloot commented 1 year ago

Given this FoLiA file:

<?xml version="1.0" encoding="UTF-8"?>
<FoLiA xmlns:xlink="http://www.w3.org/1999/xlink" xmlns="http://ilk.uvt.nl/folia" xml:id="bugxx" generator="libfolia-v1.11" version="2.5">
  <metadata type="native">
    <annotations>
      <text-annotation set="https://raw.githubusercontent.com/proycon/folia/master/setdefinitions/text.foliaset.ttl"/>
      <division-annotation/>
      <paragraph-annotation/>
      <sentence-annotation/>
      <hyphenation-annotation/>
      <token-annotation/>
    </annotations>
  </metadata>
  <text xml:id="bug">
    <div xml:id="bug.text.1.div">
      <p xml:id="bug.text.1.div.p">
        <s xml:id="bug.text.1.div.p.s">
          <w xml:id="bug.text.1.div.p.s.w.4">
            <t><t-hbr>-</t-hbr></t>
          </w>
        </s>
      </p>
    </div>
  </text>
</FoLiA>

folialint rejects this:

$ folialint tests/bugxx.xml
tests/bugxx.xml failed: tests/bugxx.xml: attempt to add an empty <t> to word: bug.text.1.div.p.s.w.4

foliavalidator accepts it:

$ foliavalidator tests/bugxx.xml 
Validated successfully: tests/bugxx.xml

I assume libfolia is mistakenly interpreting a <t> that contains only <t-hbr>-</t-hbr> as "empty". The same problem also pops up for just <t-hbr/>.
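
To illustrate the suspected cause, here is a minimal sketch in Python using only the standard `xml.etree.ElementTree` module. It is not libfolia's actual code, and the function names `naive_is_empty` and `structural_is_empty` are hypothetical. It only shows how a check that looks at the character data directly inside `<t>` would call the element empty, while a check that also considers child elements would not:

```python
import xml.etree.ElementTree as ET


def naive_is_empty(t):
    """Hypothetical check: looks only at the character data that sits
    directly inside <t>, before its first child element."""
    return not (t.text or "").strip()


def structural_is_empty(t):
    """Hypothetical check: <t> is empty only if it has no child elements
    and no non-whitespace text anywhere inside it."""
    has_children = len(t) > 0
    has_text = bool("".join(t.itertext()).strip())
    return not (has_children or has_text)


# The three cases relevant to this report (namespaces omitted for brevity).
with_hyphen = ET.fromstring("<t><t-hbr>-</t-hbr></t>")
bare_break = ET.fromstring("<t><t-hbr/></t>")
truly_empty = ET.fromstring("<t></t>")

for name, node in [("with_hyphen", with_hyphen),
                   ("bare_break", bare_break),
                   ("truly_empty", truly_empty)]:
    print(name, naive_is_empty(node), structural_is_empty(node))
```

With the naive check, all three elements count as empty. With the structural check, only `<t></t>` does. Whether a bare `<t-hbr/>` should be accepted is a question for the FoLiA specification, not for this sketch.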