BetaMasaheft / Documentation

Die Schriftkultur des christlichen Äthiopiens: Eine multimediale Forschungsumgebung

add prefixDef and refsDecl to all files where they are needed #1069

Closed PietroLiuzzo closed 5 years ago

PietroLiuzzo commented 5 years ago

We should add <prefixDef> to all files, for the benefit of users working with the non-transformed TEI. https://tei-c.org/release/doc/tei-p5-doc/en/html/ref-prefixDef.html

<listPrefixDef>
            <prefixDef ident="bm"
               matchPattern="([a-zA-Z0-9]+)"
               replacementPattern="https://www.zotero.org/groups/358366/ethiostudies/items/tag/bm:$1">
            </prefixDef>
            <prefixDef ident="pleiades"
               matchPattern="(\d{5,8})"
               replacementPattern="https://pleiades.stoa.org/places/$1">
            </prefixDef>
            <prefixDef ident="sdc"
               matchPattern="([a-zA-Z0-9]+)"
               replacementPattern="https://w3id.org/sdc/ontology#$1">
            </prefixDef>
            <prefixDef ident="snap"
               matchPattern="([a-zA-Z]+)"
               replacementPattern="http://data.snapdrgn.net/ontology/snap#$1">
            </prefixDef>
            <prefixDef ident="saws"
               matchPattern="([a-zA-Z]+)"
               replacementPattern="http://purl.org/saws/ontology#$1">
            </prefixDef>
            <prefixDef ident="skos"
               matchPattern="([a-zA-Z]+)"
               replacementPattern="http://www.w3.org/2004/02/skos/core#$1">
            </prefixDef>
            <prefixDef ident="gn"
               matchPattern="([a-zA-Z0-9]+)"
               replacementPattern="http://www.geonames.org/ontology#$1">
            </prefixDef>
            <prefixDef ident="dcterms"
               matchPattern="([a-zA-Z]+)"
               replacementPattern="http://purl.org/dc/terms/$1">
            </prefixDef>
            <prefixDef ident="lawd"
               matchPattern="([a-zA-Z]+)"
               replacementPattern="http://lawd.info/ontology/$1">
            </prefixDef>
            <prefixDef ident="syriaca"
               matchPattern="([a-zA-Z]+)"
               replacementPattern="http://syriaca.org/documentation/relations.html#$1">
            </prefixDef>
         </listPrefixDef>

The post.xsl should be updated to use this as well. This will also require prefixing Wikidata IDs, and potentially instances of @ref and @corresp.
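A minimal sketch of what such a declaration means for a prefixed pointer, assuming the pleiades pattern is meant to accept five to eight digits. The ID 123456 is a placeholder for illustration, not a real Pleiades identifier, and the element is only an example of where a prefixed @ref might appear.

```xml
<!-- Hypothetical pointer in a data-entry file; 123456 is a placeholder ID -->
<placeName xmlns="http://www.tei-c.org/ns/1.0" ref="pleiades:123456">Example place</placeName>

<!-- A processor that honours the pleiades prefixDef would match "123456"
     against matchPattern, substitute it for $1 in replacementPattern,
     and so read the pointer as:
     https://pleiades.stoa.org/places/123456 -->
```

A user of the non-transformed TEI can do this expansion from the header alone, without access to post.xsl.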

PietroLiuzzo commented 5 years ago

This will require a code freeze and a batch upload of the files.

PietroLiuzzo commented 5 years ago

<refsDecl> (https://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-refsDecl.html) is also needed, especially in manuscript transcriptions and editions. Once it is there, the scripts can rely on that information instead of hardcoding a generic XPath and regex in the XQuery to match the patterns (e.g. for DTS).
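As an illustration only, a refsDecl of this kind could look roughly like the sketch below. The reference shape ("3.12" for part 3, line 12) and the div/l structure in the XPath are assumptions made for the example; they are not taken from the actual files.

```xml
<encodingDesc xmlns="http://www.tei-c.org/ns/1.0">
  <refsDecl>
    <!-- Hypothetical pattern: "3.12" resolves to line 12 of text part 3 -->
    <cRefPattern matchPattern="(\d+)\.(\d+)"
                 replacementPattern="#xpath(//div[@type='textpart'][@n='$1']//l[@n='$2'])">
      <p>References are given as text part number and line number, separated by a dot.</p>
    </cRefPattern>
  </refsDecl>
</encodingDesc>
```

A script can then read matchPattern and replacementPattern from each file instead of carrying its own hardcoded XPath.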

PietroLiuzzo commented 5 years ago

Turns out SdC: is not a valid prefix; it needs to be sdc: https://tei-c.org/release/doc/tei-p5-doc/en/html/ref-teidata.prefix.html

PietroLiuzzo commented 5 years ago

- needs to prefix Wikidata IDs and update the scripts using Q
- needs to change all SdC and update the scripts using SdC
- needs to retransform all data in RDF to update these namespaces there

PietroLiuzzo commented 5 years ago

update guidelines accordingly

PietroLiuzzo commented 5 years ago

The guidelines for these IDs also need to be updated.

PietroLiuzzo commented 5 years ago

We can use XInclude (https://exist-db.org/exist/apps/doc/xinclude) to include the taxonomy and the prefixDef declarations, so that the content is not repeated in every file. This also needs xml:base, but not on the TEI element; for now I am adding it to teiHeader. https://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-att.global.html#tei_att.xml-base
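A rough sketch of the idea, under stated assumptions: the base URL and the two file names (listPrefixDef.xml, taxonomy.xml) are placeholders invented for the example, and the header is reduced to the parts that matter here.

```xml
<teiHeader xmlns="http://www.tei-c.org/ns/1.0"
           xmlns:xi="http://www.w3.org/2001/XInclude"
           xml:base="https://example.org/shared/">
  <fileDesc><!-- unchanged, omitted in this sketch --></fileDesc>
  <encodingDesc>
    <!-- Shared prefix declarations, kept in one place (hypothetical file) -->
    <xi:include href="listPrefixDef.xml"/>
    <classDecl>
      <!-- Shared taxonomy (hypothetical file) -->
      <xi:include href="taxonomy.xml"/>
    </classDecl>
  </encodingDesc>
</teiHeader>
```

The relative href values are resolved against the xml:base in scope, which is why the attribute has to sit on an ancestor of the xi:include elements.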

PietroLiuzzo commented 5 years ago

The XSLT transformation will not add xi:include: I already tried with the prefixes in the header and with constructing the element, and it either includes the content directly or constructs an unusable element. If an xi:include is already there (e.g. RIE 189.xml), the transformations will always result in the elements being included from the external source. So I need to understand much better how xi:include works, and this is deferred to the next release.

PietroLiuzzo commented 5 years ago

Now the post.xsl is fixed and returns the XML with xi:include.

I am still not sure what the impact on the processing is, and I need to test locally before making this change.

PietroLiuzzo commented 5 years ago

The XSLT committed above only partially addresses this because

The example of how to use this was taken from the RIB files.

PietroLiuzzo commented 5 years ago

save-new-entity.xql had the wrong processing instruction for the schema. The wrong location still redirects, but although the script has already been fixed, plenty of files still have the wrong one. They all need to be updated.
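For orientation, the instruction in question is the xml-model declaration at the top of each file. The sketch below shows only its general shape for a RELAX NG schema; the href is a placeholder, not the project's actual schema location.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical schema location; replace with the project's current one -->
<?xml-model href="https://example.org/schema/tei-betamesaheft.rng"
            schematypens="http://relaxng.org/ns/structure/1.0"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <!-- file content -->
</TEI>
```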

PietroLiuzzo commented 5 years ago

The above XQuery solves the problem of how to add the xi:include, because this works nicely when done with XQuery. However, an issue with the oXygen integration, which adds the elements twice, means this cannot be run in oXygen; it needs to be run in eXist, with eXide or with the Atom integration. This must happen at data freeze, as it will affect all files without exception, and it should be tested locally for both the data-entry TEI and the canonicalized TEI.

PietroLiuzzo commented 5 years ago

This commit https://github.com/BetaMasaheft/BetMas/commit/a544568ff2571231a9fdd122133c140e5913644b, which assumes that prefixDef is included in the files, fixes this in the post-processed file.

PietroLiuzzo commented 5 years ago

The prefix bm is also used for BM concepts, in cases like bm:LandGrant. These need to be prefixed differently, e.g. betmas:, because bm: is already used for Zotero.

PietroLiuzzo commented 5 years ago

https://github.com/BetaMasaheft/Authority-Files/commit/698f50868782fe764afe02ecac4032ac00487e83