Closed beemorris closed 1 year ago
Your examples look OK to me. All comments are presentential, i.e., must occur before the line of the first token of the sentence. See the CoNLL-U format specification. I think UDPipe can read comments.
Or did you mean by "presentential" the fact that the "sent_id" comment is not the first comment? But that is formally okay as well. There must be just one "sent_id" comment but its relative position to other comments is not prescribed.
The error message you list (which btw looks quite like the output from validate.py :-)) could actually mean that the previous sentence was not followed by a blank line. In that case the script thinks we are still reading the previous sentence and that the current comment occurs in the middle or at the end of that sentence.
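To illustrate the point about the missing blank line, here is a minimal sketch of how such a check could work. It is not the code of validate.py or UDPipe; the function name `find_misplaced_comments` and the sample data are made up for illustration. It only assumes what the format prescribes: comment lines start with `#`, token lines follow them, and a blank line ends the sentence.

```python
def find_misplaced_comments(text):
    """Return 1-based line numbers of comment lines that occur after
    the first token line of a sentence (i.e. not presentential)."""
    misplaced = []
    in_tokens = False  # True once a token line of the current sentence was seen
    for number, line in enumerate(text.split("\n"), start=1):
        if line.strip() == "":
            # A blank line terminates the sentence; comments are allowed again.
            in_tokens = False
        elif line.startswith("#"):
            if in_tokens:
                misplaced.append(number)
        else:
            # Anything else is treated as a token line (hypothetical simplification).
            in_tokens = True
    return misplaced


def token(form):
    """Build a simplified 10-column token line with ID 1 (illustrative only)."""
    return "\t".join(["1", form] + ["_"] * 8)


# Two sentences properly separated by a blank line.
GOOD = "\n".join(["# sent_id = 1", "# text = Hi", token("Hi"), "",
                  "# sent_id = 2", token("Bye"), "", ""])

# Same data, but the blank line after the first sentence is missing.
BAD = "\n".join(["# sent_id = 1", token("Hi"),
                 "# sent_id = 2", token("Bye"), "", ""])

if __name__ == "__main__":
    print(find_misplaced_comments(GOOD))
    print(find_misplaced_comments(BAD))
```

In the second sample the comment on line 3 is flagged even though it was meant to open a new sentence, which is the situation described above: the comment itself is fine, the missing blank line before it is the actual problem.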
UDPipe complains if there is a comment in front of a sentence, but the validator doesn't pick this up. Is this an issue with UDPipe (e.g., does the format allow pre-sentential comments?) or is it an issue with the validator? Here is an example:
Here is the output from UDPipe: