Closed by matyaskopp 1 year ago
Good eye, Matyáš, thanks for reporting.
Many minor corrections have been performed after I submitted the corpora, and the easiest way to fix this would be to edit the latest version, which I do not have.
@TomazErjavec, can you share the location of the latest corpus? I can use sed to correct it. In the meantime I will fix the sample.
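For illustration only, a sed-style fix could be sketched in Python roughly as below. This is a minimal sketch, not the script actually used: the helper name `replace_in_file`, the file names, and the old/new strings in the usage comment are all hypothetical placeholders.

```python
from pathlib import Path


def replace_in_file(path, old, new):
    """Replace every literal occurrence of `old` with `new` in one file.

    Returns the number of occurrences found. The file is rewritten
    only when at least one occurrence exists.
    """
    p = Path(path)
    text = p.read_text(encoding="utf-8")
    count = text.count(old)
    if count:
        p.write_text(text.replace(old, new), encoding="utf-8")
    return count


# Hypothetical usage (file names and strings are placeholders, not the real fix):
# for name in ["root.xml", "root.ana.xml"]:
#     n = replace_in_file(name, "WRONG TITLE", "CORRECT TITLE")
#     print(f"{name}: {n} replacement(s)")
```

Returning the count makes it easy to check that only the expected files were touched before re-running anything downstream.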
Indeed, well spotted, @matyaskopp! @5roop, your files are on new-tantra /project/corpora/Parla/ParlaMint/V3/Data. Looking forward to the corrected version.
@TomazErjavec, @matyaskopp, this is now taken care of.
The new version is here: new-tantra:/home/rupnik/parlamint2/ParlaMint_fixing_635
It seems the only bad files were the root TEI documents, both plain-text and ana.
> It seems the only bad files were the root TEI documents, both plain-text and ana.
If this is so, no need for me to take the complete new version - I just corrected titles in the two files, and will re-run.
in both TEI and TEI.ana versions https://github.com/clarin-eric/ParlaMint/blob/937681cf012cc7330025a8052c60cb05b1bc25ae/Data/ParlaMint-RS/ParlaMint-RS.xml#L7