Closed: fginter closed this issue 5 years ago
As discussed elsewhere, this format-only check could either [...] --lang is provided. With the online validation (where the proper --lang is used), I think there is no risk of releasing an invalid treebank. Or [...] --formatonly (or something like that).

It is worth considering which tests should be included in the format-only version. We agree lang-specific deprels and spaces in forms/lemmas should be allowed. What about
The new validator can test on 5 levels (the --level option):
Allow a basic CoNLL-U format check without the extra character set and symbol list restrictions imposed by UD.
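To make the idea of a basic check concrete, here is a minimal sketch of what a format-only pass might look at. It assumes the standard CoNLL-U layout (ten tab-separated columns, blank lines between sentences, comment lines starting with #) and nothing else. The function name check_conllu_format is hypothetical and is not part of the UD validator. It deliberately does not look at deprel values or at spaces inside forms and lemmas.

```python
import re

# Token IDs: plain integer (1), multiword range (1-2), or empty node (1.1).
ID_RE = re.compile(r"\d+(-\d+|\.\d+)?")


def check_conllu_format(text):
    """Return a list of (line_number, message) for basic format problems.

    Hypothetical helper for illustration only: it checks column count,
    ID shape, and empty columns, and ignores all UD-specific restrictions.
    """
    errors = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        if not line or line.startswith("#"):
            continue  # blank lines separate sentences; '#' lines are comments
        cols = line.split("\t")
        if len(cols) != 10:
            errors.append((lineno, f"expected 10 columns, found {len(cols)}"))
            continue
        if not ID_RE.fullmatch(cols[0]):
            errors.append((lineno, f"invalid ID {cols[0]!r}"))
        if any(col == "" for col in cols):
            errors.append((lineno, "empty column"))
    return errors


if __name__ == "__main__":
    sample = "# text = ad hoc\n1\tad hoc\tad hoc\tX\t_\t_\t0\troot\t_\t_\n\n"
    for lineno, message in check_conllu_format(sample):
        print(f"line {lineno}: {message}")
```

Because lines are split on tabs only, a form or lemma containing a space passes this check, and any string is accepted in the DEPREL column.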