clarin-eric / ParlaMint

ParlaMint: Comparable Parliamentary Corpora
https://clarin-eric.github.io/ParlaMint/

Link checking in validate-parlamint.pl does not validate factorized header files #604

Closed matyaskopp closed 1 year ago

matyaskopp commented 1 year ago

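Non-working validate-parlamint.pl (https://github.com/clarin-eric/ParlaMint/actions/runs/4098231856/jobs/7067195305#step:4:1217): link checking runs for the root file, but the files included in the teiHeader only get char and XML validation: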

  INFO: Link checking for ParlaMint-UA.xml
  INFO: Validating file included in teiHeader /home/runner/work/ParlaMint/ParlaMint/ParlaMint/Data/ParlaMint-UA/ParlaMint-taxonomy-parla.legislature.xml
  INFO: Char validation for ParlaMint-taxonomy-parla.legislature.xml
  INFO: XML validation for ParlaMint-taxonomy-parla.legislature.xml
  INFO: Validating file included in teiHeader /home/runner/work/ParlaMint/ParlaMint/ParlaMint/Data/ParlaMint-UA/ParlaMint-taxonomy-speaker_types.xml
  INFO: Char validation for ParlaMint-taxonomy-speaker_types.xml
  INFO: XML validation for ParlaMint-taxonomy-speaker_types.xml
  INFO: Validating file included in teiHeader /home/runner/work/ParlaMint/ParlaMint/ParlaMint/Data/ParlaMint-UA/ParlaMint-taxonomy-subcorpus.xml
  INFO: Char validation for ParlaMint-taxonomy-subcorpus.xml
  INFO: XML validation for ParlaMint-taxonomy-subcorpus.xml
  INFO: Validating file included in teiHeader /home/runner/work/ParlaMint/ParlaMint/ParlaMint/Data/ParlaMint-UA/ParlaMint-UA-listOrg.xml
  INFO: Char validation for ParlaMint-UA-listOrg.xml
  INFO: XML validation for ParlaMint-UA-listOrg.xml
  INFO: Validating file included in teiHeader /home/runner/work/ParlaMint/ParlaMint/ParlaMint/Data/ParlaMint-UA/ParlaMint-UA-listPerson.xml
  INFO: Char validation for ParlaMint-UA-listPerson.xml
  INFO: XML validation for ParlaMint-UA-listPerson.xml

Working make check-links target, which does report link errors in the header components:

make check-links-UA PARLIAMENTS=UA 2>&1
for root in `find Data -type f -path "Data/ParlaMint-UA/ParlaMint-*.xml" | grep -P "ParlaMint-UA(|\.ana).xml"`; do \
  echo "checking links in root:" ${root}; \
  java  -jar /usr/share/java/saxon.jar -xsl:Scripts/check-links.xsl ${root}; \
  for component in `echo ${root}| xargs -I % java -cp /usr/share/java/saxon.jar net.sf.saxon.Query -xi:off \!method=adaptive -qs:'//*[local-name()="teiHeader"]//*[local-name()="include"]/@href' -s:% |sed 's/^ *href="//;s/"//'`; do \
    echo "checking links in header component:" Data/ParlaMint-UA/${component}; \
    java  -jar /usr/share/java/saxon.jar meta=/home/matyas/Documents/UFAL/REP/ParlaMint-matyaskopp/${root} -xsl:Scripts/check-links.xsl Data/ParlaMint-UA/${component}; \
  done; \
  for component in `echo ${root}| xargs -I % java -cp /usr/share/java/saxon.jar net.sf.saxon.Query -xi:off \!method=adaptive -qs:'/*/*[local-name()="include"]/@href' -s:% |sed 's/^ *href="//;s/"//'`; do \
    echo "checking links in component:" Data/ParlaMint-UA/${component}; \
    java  -jar /usr/share/java/saxon.jar meta=/home/matyas/Documents/UFAL/REP/ParlaMint-matyaskopp/${root} -xsl:Scripts/check-links.xsl Data/ParlaMint-UA/${component}; \
  done; \
done
checking links in root: Data/ParlaMint-UA/ParlaMint-UA.ana.xml
checking links in header component: Data/ParlaMint-UA/ParlaMint-taxonomy-NER.ana.xml
checking links in header component: Data/ParlaMint-UA/ParlaMint-taxonomy-UD-SYN.ana.xml
checking links in header component: Data/ParlaMint-UA/ParlaMint-taxonomy-parla.legislature.xml
checking links in header component: Data/ParlaMint-UA/ParlaMint-taxonomy-speaker_types.xml
checking links in header component: Data/ParlaMint-UA/ParlaMint-taxonomy-subcorpus.xml
checking links in header component: Data/ParlaMint-UA/ParlaMint-UA-listOrg.xml
ERROR ParlaMint-UA-listOrg: ERROR: Can't find local id for relation/@passive="#pp.ed"
checking links in header component: Data/ParlaMint-UA/ParlaMint-UA-listPerson.xml
ERROR ParlaMint-UA-listPerson: ERROR: Can't find local id for affiliation/@ana="#GOV.UA.Shmyhal #acting"
ERROR ParlaMint-UA-listPerson: ERROR: Can't find local id for affiliation/@ana="#GOV.UA.Shmyhal #acting"
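
The make recipe above first lists the files included from the teiHeader and then runs the link check on each of them, which is the step validate-parlamint.pl appears to skip. A minimal Python sketch of that two-stage idea follows. It is not the project's check-links.xsl: the function names, the POINTER_ATTRS list and the per-token reporting are assumptions made for illustration, and it only checks local "#id" references against xml:id values found in the root file and its header components.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
# Hypothetical list of pointer attributes; the real stylesheet may check others.
POINTER_ATTRS = ("ana", "passive", "active", "ref", "who")


def local_name(tag):
    """Strip the '{namespace}' prefix, mirroring local-name() in the recipe."""
    return tag.rsplit("}", 1)[-1]


def header_components(root_path):
    """Paths of files referenced by include elements inside teiHeader.

    ET.parse does not expand XInclude, so the include elements are still
    present (comparable to running Saxon with -xi:off).
    """
    root_path = Path(root_path)
    found = []
    for el in ET.parse(root_path).iter():
        if local_name(el.tag) != "teiHeader":
            continue
        for inc in el.iter():
            if local_name(inc.tag) == "include" and inc.get("href"):
                found.append(root_path.parent / inc.get("href"))
    return found


def collect_ids(paths):
    """All xml:id values declared in the given files."""
    ids = set()
    for path in paths:
        for el in ET.parse(path).iter():
            if XML_ID in el.attrib:
                ids.add(el.attrib[XML_ID])
    return ids


def unresolved_refs(path, known_ids):
    """(element, attribute, token) triples for local refs with no matching id."""
    problems = []
    for el in ET.parse(path).iter():
        for attr in POINTER_ATTRS:
            for token in el.get(attr, "").split():
                if token.startswith("#") and token[1:] not in known_ids:
                    problems.append((local_name(el.tag), attr, token))
    return problems


def check_header_links(root_path):
    """Check every header component, not only the root file."""
    root_path = Path(root_path)
    components = header_components(root_path)
    known = collect_ids([root_path, *components])
    report = {}
    for comp in components:
        missing = unresolved_refs(comp, known)
        if missing:
            report[comp.name] = missing
    return report
```

Calling check_header_links("Data/ParlaMint-UA/ParlaMint-UA.xml") would return a dict keyed by component file name, so an empty dict means no unresolved local references were found in the header components under these assumptions.
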

matyaskopp commented 1 year ago

This seems to be fixed in devel and in the ana version.