Closed by gossebouma 6 years ago
Thanks for reporting this, it should be fixed now. I agree that multiple roots is a more severe error than punctuation with children (and obviously we need to break one of these two rules in this sentence unless we change the tag of * from PUNCT to something else).
BTW, a PUNCT enclosed in paired punctuation (parentheses, quotes) occurs in other corpora too, not necessarily as a child of the root node. I think it would be natural in these cases to allow the paired punctuation to be attached to the symbol inside.
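For anyone who wants to check their own data against the two rules discussed here (a single root, and no PUNCT node with children), below is a rough sketch in plain Python. It does not use Udapi or the official UD validator; the helper names and the three-token sample sentence are made up for illustration and are not the input from this report.

```python
def parse_conllu_sentence(text):
    """Return (id, form, upos, head) tuples for the basic tokens of one sentence."""
    tokens = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword token ranges (1-2) and empty nodes (1.1)
        tokens.append((int(cols[0]), cols[1], cols[3], int(cols[6])))
    return tokens


def count_roots(tokens):
    """Number of tokens whose HEAD is 0."""
    return sum(1 for _, _, _, head in tokens if head == 0)


def punct_with_children(tokens):
    """IDs of PUNCT tokens that have at least one dependent."""
    upos = {tid: tag for tid, _, tag, _ in tokens}
    return sorted({head for _, _, _, head in tokens
                   if head != 0 and upos.get(head) == "PUNCT"})


# Hypothetical sample: a PUNCT-tagged symbol enclosed in parentheses,
# with the paired punctuation attached to the symbol inside.
rows = [
    ["1", "(", "(", "PUNCT", "_", "_", "2", "punct", "_", "_"],
    ["2", "*", "*", "PUNCT", "_", "_", "0", "root", "_", "_"],
    ["3", ")", ")", "PUNCT", "_", "_", "2", "punct", "_", "_"],
]
SAMPLE = "\n".join("\t".join(r) for r in rows)

tokens = parse_conllu_sentence(SAMPLE)
print("roots:", count_roots(tokens))
print("PUNCT nodes with children:", punct_with_children(tokens))
```

With this sample the tree has one root, but token 2 is a PUNCT with children, which is exactly the trade-off described above: keeping a single root here means breaking the other rule.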
My input file, test.conllu, contains:
after running
udapy -s ud.FixPunct < test.conllu
I get:
It is a pathological string; nevertheless, this is a nasty bug (it took me a while to realize it was not my conversion script that was wrong...).