Closed: matyaskopp closed this issue 2 years ago
The current version of the CoNLL-U validator no longer raises a recursion depth exception: https://github.com/UniversalDependencies/tools/commit/67cdccbac56ffc8a3801b34e05e0fa9052031c9f
I would still suggest looking into this issue, though. @BartJongejan, if you don't want to fix it, feel free to close it.
I have checked whether our segmenter/tokenizer program did something unexpected. My colleagues and I concluded that it didn't. So, for this particular 'sentence', the only option left is to split the sentence manually. We have decided not to do that, since the improvement would be of little value. Please close this issue. (It seems I cannot do that myself.)
Ok, thanks for checking it.
This error also appears in the original log that was produced during ParlaMint sample creation. @TomazErjavec reported it by email (2021-04-20):
This error appears only in the DK corpus, and it can probably be fixed, since ParlaMint-DK_20141008130437.seg2.7 does not look like a sentence. I guess it can be split into multiple segments. The source data (https://www.ft.dk/forhandlinger/20141/20141M002_2014-10-08_1300.htm) looks like multiple glued lists to me.
error: https://github.com/clarin-eric/ParlaMint/runs/4704545202?check_suite_focus=true#step:4:188