Conversion of TEI P5 based formats to KorAP-XML
BSD 2-Clause "Simplified" License

header validationnnnot really... #3

Closed (bansp closed this issue 2 years ago)

bansp commented 2 years ago

I'm looking at a fragment of the Gingko corpus header from https://github.com/KorAP/KorAP-XML-Krill/blob/master/t/real/corpus/Gingko/ATZ07/JAN/00001/header.xml :

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="header.rng"
<idsHeader TEIform="teiHeader" pattern="text" status="new" type="text" version="1.0">
            <t.title assemblage="external">ATZ07/JAN.00001 ATZ - Automobiltechnische Zeitschrift, Januar 2007, Nr. 109(1), S. 10-15; Ein neues Energiemanagement-Konzept für das elektrische Bordnetz</t.title>

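It seems to me that both schema statements (the xml-model processing instruction and the DOCTYPE declaration) are ornaments: neither of them actually constrains the document.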

Fixing the root element in the DTD doesn't get us anywhere either: validation then reports lots of errors, which suggests that the DTD has not been kept up to date while the XML has evolved.

Removing both schema associations would make the result truer, in a way ;-) <!DOCTYPE idsHeader> also works, but, naturally, it doesn't do much.
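To see which schema associations a header actually carries, something along these lines could be used. It is only a sketch: it lists the xml-model processing instructions and compares the DOCTYPE name with the root element, but it does not validate anything (Python's standard library has no DTD or RELAX NG validator). The function name schema_associations, the SAMPLE document, its DOCTYPE name someOtherRoot and the file name example.dtd are all made up for illustration; they are not taken from the Gingko header.

```python
from xml.dom import minidom

# Hypothetical sample, NOT the real Gingko header: the DOCTYPE name and
# the DTD file name are invented to demonstrate a root-element mismatch.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="header.rng"?>
<!DOCTYPE someOtherRoot SYSTEM "example.dtd">
<idsHeader type="text" version="1.0">
  <t.title>placeholder</t.title>
</idsHeader>
"""


def schema_associations(xml_bytes):
    """Report the schema associations declared in an XML document.

    Only the declarations are inspected; the referenced DTD or RELAX NG
    files are neither fetched nor applied.
    """
    doc = minidom.parseString(xml_bytes)
    root_name = doc.documentElement.tagName
    doctype_name = doc.doctype.name if doc.doctype is not None else None

    # xml-model processing instructions are direct children of the document.
    models = [
        node.data
        for node in doc.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        and node.target == "xml-model"
    ]

    return {
        "doctype": doctype_name,
        "root": root_name,
        "xml_model": models,
        # None means there is no DOCTYPE to compare against.
        "doctype_matches_root": (
            None if doctype_name is None else doctype_name == root_name
        ),
    }


if __name__ == "__main__":
    for key, value in schema_associations(SAMPLE).items():
        print(f"{key}: {value}")
```

A DOCTYPE whose name differs from the root element can never validate, so a False in doctype_matches_root is a cheap hint that the declaration is decorative. A True says nothing about whether the DTD itself is up to date.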

Akron commented 2 years ago

You are right - these references seem to be independent of the header source. So as a quick fix I removed them from generation.