clarin-eric / ParlaMint

ParlaMint: Comparable Parliamentary Corpora
https://clarin-eric.github.io/ParlaMint/

ES-GA - help needed with syntactic annotation errors in sample #618

Closed adina-v closed 1 year ago

adina-v commented 1 year ago

I'm trying to validate the sample, and these errors are coming up in the annotated files:

ERROR: Can't find local id for link/@ana="ud-syn:nsubj:pass"
ERROR: Can't find local id for link/@ana="ud-syn:aux:pass"

I'd be grateful for any help on how to fix this. Thank you!

matyaskopp commented 1 year ago

Replace the colon inside the relation name with an underscore, keeping the colon after the ud-syn prefix: link/@ana="ud-syn:nsubj_pass" (and likewise "ud-syn:aux_pass").

It should refer to this category: https://github.com/clarin-eric/ParlaMint/blob/031ec3009386a4bfec60bf0e22f653a813ddf98c/Data/Taxonomies/ParlaMint-taxonomy-UD-SYN.ana.xml#L1046-L1048
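The renaming can be applied in bulk rather than by hand. The following is a minimal sketch, not the project's own tooling: it assumes the annotated files use the TEI namespace and that every affected value starts with the ud-syn: prefix. The helper names (fix_ud_syn, fix_links) and the sample fragment with its target ids are made up for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
PREFIX = "ud-syn:"


def fix_ud_syn(ana: str) -> str:
    """Turn colons after the 'ud-syn:' prefix into underscores."""
    if not ana.startswith(PREFIX):
        return ana  # leave values with other prefixes untouched
    return PREFIX + ana[len(PREFIX):].replace(":", "_")


def fix_links(root: ET.Element) -> int:
    """Rewrite link/@ana in place and return how many links changed."""
    changed = 0
    for link in root.iter(f"{{{TEI_NS}}}link"):
        old = link.get("ana")
        if old is None:
            continue
        new = fix_ud_syn(old)
        if new != old:
            link.set("ana", new)
            changed += 1
    return changed


# Hypothetical fragment; the target ids are placeholders.
SAMPLE = """<linkGrp xmlns="http://www.tei-c.org/ns/1.0" type="UD-SYN" targFunc="head argument">
  <link ana="ud-syn:nsubj:pass" target="#w2 #w1"/>
  <link ana="ud-syn:aux:pass" target="#w2 #w3"/>
  <link ana="ud-syn:punct" target="#w2 #w4"/>
</linkGrp>"""

root = ET.fromstring(SAMPLE)
n_changed = fix_links(root)
fixed_values = [link.get("ana") for link in root.iter(f"{{{TEI_NS}}}link")]
print(n_changed, fixed_values)
```

Note that writing a whole file back with ElementTree may change namespace prefixes and formatting, so it is worth re-running the validation and checking the diff afterwards.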

adina-v commented 1 year ago

Thank you! (I understand that we should replace our current ParlaMint-taxonomy-UD-SYN.ana with the one you provided?)

Also, another quick question. We are getting this error:

error: attribute "lemma" not allowed here; expected attribute "ana", "join", "msd", "norm", "pos" or "xml:lang"

This is an example of our current annotation that triggers this error:

<pc xml:id="ParlaMint-ES-GA_2016-12-14-DSPG008.seg13.s1.w5" lemma=":" msd="UPosTag=PUNCT">:</pc>
<pc xml:id="ParlaMint-ES-GA_2016-12-14-DSPG008.seg13.s1.w6" lemma="¿" msd="UPosTag=PUNCT" join="right">¿</pc>

I understand that we should have instead

<pc xml:id="ParlaMint-ES-GA_2016-12-14-DSPG008.seg13.s1.w5" msd="UPosTag=PUNCT">:</pc>
<pc xml:id="ParlaMint-ES-GA_2016-12-14-DSPG008.seg13.s1.w6" msd="UPosTag=PUNCT" join="right">¿</pc>

Is that correct? Thank you!

matyaskopp commented 1 year ago

> Thank you! (I understand that we should replace our current ParlaMint-taxonomy-UD-SYN.ana with the one you provided?)

Yes, it is better to use this taxonomy, which avoids possible errors from missing categories in the future. We plan to use it for all corpora. It contains all documented relations (if you find one missing, please report it).

> I understand that we should have instead

Yes, the <pc> element does not allow the lemma attribute.
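If many files are affected, the attribute can be dropped programmatically. This is only a sketch under assumptions: it presumes the TEI namespace, the function name strip_pc_lemma is invented here, and the sample uses shortened placeholder ids plus one hypothetical <w> element to show that word tokens keep their lemma.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def strip_pc_lemma(root: ET.Element) -> int:
    """Remove @lemma from every <pc>; return the number of attributes removed."""
    removed = 0
    for pc in root.iter(f"{{{TEI_NS}}}pc"):
        if pc.attrib.pop("lemma", None) is not None:
            removed += 1
    return removed


# Hypothetical sentence; ids are shortened placeholders.
SAMPLE = """<s xmlns="http://www.tei-c.org/ns/1.0">
  <w xml:id="w4" lemma="dicir" msd="UPosTag=VERB">dixo</w>
  <pc xml:id="w5" lemma=":" msd="UPosTag=PUNCT">:</pc>
  <pc xml:id="w6" lemma="¿" msd="UPosTag=PUNCT" join="right">¿</pc>
</s>"""

root = ET.fromstring(SAMPLE)
n_removed = strip_pc_lemma(root)
pc_elements = list(root.iter(f"{{{TEI_NS}}}pc"))
w_elements = list(root.iter(f"{{{TEI_NS}}}w"))
print(n_removed, [pc.attrib for pc in pc_elements])
```

The other attributes on <pc> (msd, join, xml:id) are left as they are, matching the corrected example earlier in the thread.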

adina-v commented 1 year ago

Great, thank you!

TomazErjavec commented 1 year ago

I think this has all been sorted out, so I am closing this issue.