clarin-eric / ParlaMint

ParlaMint: Comparable Parliamentary Corpora
https://clarin-eric.github.io/ParlaMint/

"ParlaMint-IS-taxonomy-parla.subcorpus.xml" instead of "ParlaMint-taxonomy-subcorpus.xml" #477

Closed starkadur closed 1 year ago

starkadur commented 1 year ago

I factorized ParlaMint-IS a while ago and all seemed to go well. But when looking at the taxonomies I noticed that I have the included file "ParlaMint-IS-taxonomy-parla.subcorpus.xml". Should it not be "ParlaMint-taxonomy-subcorpus.xml"?

Was it made corpus-specific (and thus distinguished by including the country code IS) because of the "parla" in taxonomy-parla.subcorpus? I think I did not receive any error because of this.

matyaskopp commented 1 year ago

It happened because you did not have a taxonomy[@xml:id="subcorpus"]; you used taxonomy[@xml:id="parla.subcorpus"] instead. The factorization script checks whether it knows the taxonomy:

known common taxonomies: https://github.com/clarin-eric/ParlaMint/blob/c5fd7aaee29b18cf537a84850d41c4930250de51/Scripts/parlamint-factorize-teiHeader.xsl#L32-L39
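
To illustrate the naming behaviour described above, here is a rough Python sketch. It is not the actual XSLT logic of parlamint-factorize-teiHeader.xsl. The function name, the set name, and the contents of the set are assumptions made for illustration; the authoritative list of known common taxonomies is the one in the linked script. The only things taken from this thread are the two file names and the two xml:id values.

```python
# Hypothetical sketch of the file-naming decision; the real logic lives in the XSLT script.
# Assumption: "subcorpus" is one of the known common taxonomy ids.
# The full, authoritative list is in the linked script and is not reproduced here.
KNOWN_COMMON_TAXONOMIES = {"subcorpus"}


def taxonomy_filename(taxonomy_id: str, country_code: str) -> str:
    """Return the file name a factorized taxonomy would get (illustrative only)."""
    if taxonomy_id in KNOWN_COMMON_TAXONOMIES:
        # Known id: treated as a common taxonomy, so no country code in the name.
        return f"ParlaMint-taxonomy-{taxonomy_id}.xml"
    # Unknown id: treated as corpus-specific, so the country code is included.
    return f"ParlaMint-{country_code}-taxonomy-{taxonomy_id}.xml"


print(taxonomy_filename("subcorpus", "IS"))        # ParlaMint-taxonomy-subcorpus.xml
print(taxonomy_filename("parla.subcorpus", "IS"))  # ParlaMint-IS-taxonomy-parla.subcorpus.xml
```

Under these assumptions an unrecognised id such as "parla.subcorpus" is not an error. It simply falls through to the corpus-specific branch, which would explain why no error was reported.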

TomazErjavec commented 1 year ago

I think this has all been fixed, so, closing.