Closed matyaskopp closed 1 year ago
I see this has already been done. However, I just noticed that we have two languages in NO utterances, namely nno and nob. I fixed this in 5a13cd6.
But, there are two issues here, @tungland:
It occurs to me that maybe I wasn't clear, even to myself: if NO has been processed simply with a "no" pipeline, we can revert the languages for NO to just "no", as it was, and the problem with multiple CoNLL-U files goes away.
@TomazErjavec
In ParlaMint we use two-character language codes when possible. Any special reason you don't use nb and nn?
No reason. Where did I use a 3-letter code?
Did you actually run your annotation twice, once for nno and once for nob, and insert the results in the appropriate utterances? Or is this all done with a generic no pipeline?
Our model supports both language modes of Norwegian, so yes, there was only one annotation for Norwegian.
@tungland, like here: https://github.com/clarin-eric/ParlaMint/blob/d6ca7bdfa0e2a4394c4f1b8e2921c98c6c1b3fb7/Data/ParlaMint-NO/ParlaMint-NO_1999-03-02-lower.ana.xml#L125
OK, reverted this change then.
Ah ok. I must have missed this preference. Is this blocking submission?
No, but you might consider fixing it for 3.1.
I'll make a note of it! Thanks!
I already did with milestone 3.1. :)
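For the 3.1 fix, the code change itself could be sketched roughly as follows. This assumes the three-letter codes appear as xml:lang attribute values in the .ana.xml files; the function name normalise_lang_codes and the sample fragment are made up for illustration, and this is not a script from the repository.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# ISO 639-3 -> ISO 639-1 for the two written standards of Norwegian.
THREE_TO_TWO = {"nob": "nb", "nno": "nn"}


def normalise_lang_codes(root):
    """Rewrite three-letter xml:lang values in place; return how many changed."""
    changed = 0
    for elem in root.iter():
        lang = elem.get(XML_LANG)
        if lang in THREE_TO_TWO:
            elem.set(XML_LANG, THREE_TO_TWO[lang])
            changed += 1
    return changed


# Hypothetical fragment, only loosely modelled on a TEI utterance.
sample = (
    '<u xmlns="http://www.tei-c.org/ns/1.0" xml:lang="nob">'
    '<seg xml:lang="nno">...</seg>'
    '<seg xml:lang="en">...</seg>'
    "</u>"
)
root = ET.fromstring(sample)
count = normalise_lang_codes(root)
```

Codes that are not in the mapping (such as en above) are left untouched, so the same pass should be safe to run over files that already use two-letter codes.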
This has all been resolved, I think; the final word is in d6216a4. Bilingual corpora now get 3 CoNLL-U files per .ana.xml file:
Currently, only BE uses multiple languages in settings: https://github.com/clarin-eric/ParlaMint/blob/1a838f7d3435941d7e9e06d9ecfdba52fe141dac/Scripts/parlamint2conllu.pl#L32-L62
But there are more parliaments with multiple languages...