elexis-eu / tei2ontolex

TEI to OntoLex Conversion
Apache License 2.0

Wrap everything in lime:Lexicon with some DublinCore terms #16

Open kernc opened 3 years ago

kernc commented 3 years ago

I'd expect some basic metadata from the original dictionary to survive the transformation.

I absolutely have no idea what I'm doing, and this is my first XSLT ever, so please kindly advise.

jmccrae commented 3 years ago

Seems good from an OntoLex perspective. If @laurentromary is happy with this, we can merge.

laurentromary commented 3 years ago

No problem apart from the comments I made. Who is @kernc, BTW?

kernc commented 3 years ago

Just a passer-by making sure language technologies are up to speed. :stuck_out_tongue_closed_eyes:

From a Lemon-Lime perspective, the Lexicon still seems to be missing a language, which the ontology restricts to exactly one:

<owl:Restriction>
    <owl:onProperty rdf:resource="http://www.w3.org/ns/lemon/lime#language"/>
    <owl:cardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">1</owl:cardinality>
</owl:Restriction>

Unsure how to construct it, I just left it out. :sweat_smile:
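For illustration only, here is a minimal sketch of what a Lexicon carrying exactly one lime:language could look like when serialized as RDF/XML. It is written in Python with the standard library rather than in the project's XSLT. The function name build_lexicon, the example.org URI and the hard-coded "en" tag are all hypothetical; where the tag should actually come from in the TEI document is still an open question.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
LIME = "http://www.w3.org/ns/lemon/lime#"

# Register prefixes so the serialized output uses rdf: and lime: instead of ns0:, ns1:.
ET.register_namespace("rdf", RDF)
ET.register_namespace("lime", LIME)


def build_lexicon(lexicon_uri, language_tag):
    """Return an rdf:RDF element holding one lime:Lexicon with one lime:language.

    Hypothetical helper: both arguments are supplied by the caller, nothing is
    read from a TEI document here.
    """
    root = ET.Element(f"{{{RDF}}}RDF")
    lexicon = ET.SubElement(
        root, f"{{{LIME}}}Lexicon", {f"{{{RDF}}}about": lexicon_uri}
    )
    # Exactly one lime:language, emitted as a plain literal.
    language = ET.SubElement(lexicon, f"{{{LIME}}}language")
    language.text = language_tag
    return root


# Assumed values, for demonstration only.
rdf_xml = ET.tostring(
    build_lexicon("http://example.org/lexicon", "en"), encoding="unicode"
)
print(rdf_xml)
```

The same shape should be straightforward to emit from the stylesheet once there is agreement on which TEI node supplies the language tag.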

Updated the changed tests.

kernc commented 3 years ago

The prior version had issues. :sweat_smile: The DC terms transformation has now been amended to take into account only nodes in TEI/teiHeader/fileDesc, and to use only matching nodes' arbitrarily-nested normalized text content.
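As a rough sketch of that behaviour (not the actual XSLT), the Python below looks only inside TEI/teiHeader/fileDesc and flattens each matching node's nested text into a whitespace-normalized string. The mapping FILEDESC_TO_DC, the helper names and the sample document are assumptions made for this example; the set of elements the stylesheet really maps may differ.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical mapping from paths under fileDesc to Dublin Core terms.
FILEDESC_TO_DC = {
    "tei:titleStmt/tei:title": "dc:title",
    "tei:publicationStmt/tei:publisher": "dc:publisher",
}


def normalized_text(element):
    """Join all nested text of an element and collapse runs of whitespace."""
    return " ".join("".join(element.itertext()).split())


def extract_dc_terms(tei_xml):
    """Collect DC terms from nodes under TEI/teiHeader/fileDesc only."""
    root = ET.fromstring(tei_xml)
    file_desc = root.find("tei:teiHeader/tei:fileDesc", TEI_NS)
    terms = {}
    if file_desc is None:
        return terms
    for path, dc_term in FILEDESC_TO_DC.items():
        for node in file_desc.findall(path, TEI_NS):
            text = normalized_text(node)
            if text:
                terms.setdefault(dc_term, []).append(text)
    return terms


# Made-up sample: the title in the body must be ignored.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>A  <hi>Small</hi>
        Dictionary</title></titleStmt>
      <publicationStmt><publisher>Example Press</publisher></publicationStmt>
    </fileDesc>
  </teiHeader>
  <text><body><title>Not metadata</title></body></text>
</TEI>"""

print(extract_dc_terms(SAMPLE))
```

Restricting the search to fileDesc is what keeps titles that appear elsewhere in the document, such as the one in the sample body, out of the Lexicon metadata.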

Previously, I rebuilt the tests with a simple call to g.serialize(...) in test.convert_tei_to_ontolex(). They've now been gone over and updated to produce a smaller, friendlier diff.

There are additional Lexicon properties I'd like to extract, namely dc:description, dc:subject and lime:language, but I'm not sure where in the TEI document they are canonically found. I'd welcome your thoughts.