proycon / folia

FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for processing FoLiA is implemented as part of PyNLPl; this repository contains higher-level tools that use the library, as well as the full documentation, validation schemas, and set definitions.
http://proycon.github.io/folia/
GNU General Public License v3.0

foliavalidator fails on example with a stacktrace #72

Closed kosloot closed 5 years ago

kosloot commented 5 years ago

try: foliavalidator group-annotations.2.0.0.folia.xml -o

This produces a stacktrace:

Traceback (most recent call last):
  File "/usr/local/bin/foliavalidator", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/foliatools/foliavalidator.py", line 106, in main
    r = validate(file, schema, **args.__dict__)
  File "/usr/local/lib/python3.6/dist-packages/foliatools/foliavalidator.py", line 51, in validate
    print(document.xmlstring())
  File "/home/sloot/.local/lib/python3.6/site-packages/folia/main.py", line 8040, in xmlstring
    return str(ElementTree.tostring(self.xml(), xml_declaration=True, pretty_print=True, encoding='utf-8'),'utf-8')
  File "/home/sloot/.local/lib/python3.6/site-packages/folia/main.py", line 7190, in xml
    e.append(text.xml())
  File "/home/sloot/.local/lib/python3.6/site-packages/folia/main.py", line 2306, in xml
    xml = child.xml() #may return None in rare occassions, meaning we wan to skip
  File "/home/sloot/.local/lib/python3.6/site-packages/folia/main.py", line 2306, in xml
    xml = child.xml() #may return None in rare occassions, meaning we wan to skip
  File "/home/sloot/.local/lib/python3.6/site-packages/folia/main.py", line 2306, in xml
    xml = child.xml() #may return None in rare occassions, meaning we wan to skip
  File "/home/sloot/.local/lib/python3.6/site-packages/folia/main.py", line 4826, in xml
    raise ValueError("No set specified or derivable for annotation layer " + self.__class__.__name__)
ValueError: No set specified or derivable for annotation layer EntitiesLayer

I would expect:

  1. Only the last warning about the ValueError, not the full stack trace
  2. In fact NO warning at all, as this example is NOT in the erroneous subdir

proycon commented 5 years ago

Hmm, right, this is indeed a serious one. I'd better test everything with the -o option enabled to catch serialisation problems.

proycon commented 5 years ago

I fixed this, and the validator now does a serialisation check by default (even without -o), similar to folialint I think. If anything goes wrong at that stage, which means the document has already been deemed valid, it will still dump a full traceback: in that case it's probably a library problem, and we want people to be able to report it with as much info as possible.
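
For anyone reading along, the two-stage behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual foliavalidator code: `check_document`, `load` and `serialise` are hypothetical names, and the loader and serialiser are passed in as plain callables so the sketch runs without the FoLiA library installed.

```python
import traceback
from typing import Any, Callable


def check_document(
    path: str,
    load: Callable[[str], Any],
    serialise: Callable[[Any], str],
) -> bool:
    """Return True if the document at `path` loads and serialises cleanly.

    `load` and `serialise` are hypothetical hooks standing in for whatever
    the real validator calls; they are assumptions made for this sketch.
    """
    # Stage 1: parsing/validation. A failure here means the document itself
    # is invalid, so a short one-line message is enough.
    try:
        document = load(path)
    except Exception as exc:
        print(f"VALIDATION ERROR in {path}: {exc}")
        return False

    # Stage 2: serialisation check, always performed. The document has
    # already been accepted, so a failure here most likely points at a
    # library problem; print the full traceback so it can be reported.
    try:
        serialise(document)
    except Exception:
        print(f"SERIALISATION ERROR in {path} (document parsed fine):")
        traceback.print_exc()
        return False

    return True
```

With the real library one would presumably plug in something like `lambda p: folia.Document(file=p)` and `lambda d: d.xmlstring()`; the latter is the call that shows up in the traceback above, but treat the exact wiring as an assumption.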