acoli-repo / acoli-dicts

3000+ machine-readable open source dictionaries distributed by the Applied Computational Linguistics lab at the University of Augsburg, Germany, and by the research group Linked Open Dictionaries (LiODi, funded 2015-2020 by BMBF at Goethe University Frankfurt, Germany). All data provided in OntoLex-Lemon and TIAD-TSV.
Apache License 2.0

Apertium RDF - metadata per dictionary #21

Open jogracia opened 2 years ago

jogracia commented 2 years ago

A metadata file should be added for every dictionary. For inspiration, I copy the metadata of a dictionary in Apertium RDF v1.0:

The Zenodo link could be added as an owl:sameAs.

jogracia commented 2 years ago

Another suggestion: we could add a "version" tag (v2.2?).

max-ionov commented 2 years ago

@jogracia, should this be done automatically? If so, I foresee problems with CKAN, hdl.handle.net and METASHARE, since we don't know the IDs in advance (we can generate http://datahub.ckan.io/dataset/apertium-rdf-en-es but not https://datahub.ckan.io/dataset/47e9d8cc-5da9-4c02-960e-c00abee2b0d9).
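To make the distinction concrete, here is a minimal sketch of which links a script could write on its own. The base URL and the `apertium-rdf-<src>-<tgt>` slug pattern are taken from the example above; the function names and the `assigned_id` parameter are hypothetical and only illustrate that the opaque ID has to be supplied from outside.

```python
from typing import List, Optional

# Base URL and slug pattern follow the example in this thread.
DATAHUB_BASE = "http://datahub.ckan.io/dataset"


def predictable_dataset_url(src: str, tgt: str) -> str:
    """Build the slug-based URL, which depends only on the language pair."""
    return f"{DATAHUB_BASE}/apertium-rdf-{src.lower()}-{tgt.lower()}"


def link_targets(src: str, tgt: str, assigned_id: Optional[str] = None) -> List[str]:
    """Return the links that can be written into the metadata right now.

    `assigned_id` (hypothetical parameter) stands for the opaque ID that the
    hosting service hands out after upload. It stays None until someone
    looks it up, so the second link cannot be generated in advance.
    """
    targets = [predictable_dataset_url(src, tgt)]
    if assigned_id is not None:
        targets.append(f"https://datahub.ckan.io/dataset/{assigned_id}")
    return targets
```

Calling `link_targets("en", "es")` before the upload yields only the slug-based URL; the UUID-based one appears only once the ID is passed in by hand.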

chiarcos commented 2 years ago

This is part of the general problem of data hosting policies that are not Linked Data compliant, which I've been complaining about at times. In this particular case, we may find a partial workaround by using PURL or (better) W3ID, but these cannot be automated either; they have to be set up and maintained manually.

On Tue, 12 Apr 2022 at 12:54, Max Ionov wrote:

@jogracia, should this be done automatically? If so, I foresee problems with CKAN, hdl.handle.net and METASHARE, since we don't know the IDs in advance (we can generate http://datahub.ckan.io/dataset/apertium-rdf-en-es but not https://datahub.ckan.io/dataset/47e9d8cc-5da9-4c02-960e-c00abee2b0d9).


jogracia commented 2 years ago

Good point. What we actually did for Apertium RDF v1.0 was to upload the data into Datahub and then consult the generated IDs and manually include the sameAs statements in the metadata files, to finally update only the metadata files (the data remained unchanged). Now Datahub is discontinued, but we can do the same with Zenodo (seeAlso) and Linghub (sameAs) once the data is documented there.
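The two-pass workflow described here could be sketched roughly as follows: write the metadata first without external links, then regenerate only the metadata file once the IDs are known. This is an illustration under assumptions, not the script used for Apertium RDF v1.0. The function name, the example.org URIs, and the choice of `dct:title` and `owl:versionInfo` are hypothetical; only the sameAs/seeAlso split comes from the comment above.

```python
def metadata_turtle(dataset_uri, title, version=None, same_as=(), see_also=()):
    """Render a small Turtle description of one dictionary.

    Assumes `title` and `version` contain no double quotes or backslashes,
    so no string escaping is done here.
    """
    header = [
        "@prefix dct: <http://purl.org/dc/terms/> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    props = [f'dct:title "{title}"']
    if version:
        props.append(f'owl:versionInfo "{version}"')
    # Links that only exist after the data has been registered elsewhere.
    props += [f"owl:sameAs <{uri}>" for uri in same_as]
    props += [f"rdfs:seeAlso <{uri}>" for uri in see_also]
    body = [f"<{dataset_uri}>", "    " + " ;\n    ".join(props) + " ."]
    return "\n".join(header + body) + "\n"


# Hypothetical URIs, for illustration only.
DATASET = "https://example.org/apertium-rdf/en-es"

# Pass 1: before upload, no external IDs are known yet.
draft = metadata_turtle(DATASET, "Apertium RDF EN-ES", version="2.2")

# Pass 2: after looking up the generated IDs, rewrite only the metadata file.
final = metadata_turtle(
    DATASET,
    "Apertium RDF EN-ES",
    version="2.2",
    same_as=["https://example.org/linghub/apertium-rdf-en-es"],
    see_also=["https://example.org/zenodo/record-id"],
)
```

The dictionary data itself is never touched in the second pass; only the metadata string changes.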

chiarcos commented 2 years ago

But shouldn't the metadata be bundled with the data? If I recall correctly, Zenodo (at least) will generate a new DOI/URI for every version, so there is no good way to update previously deposited metadata (you need to ask them for every single item). See the first point under https://help.zenodo.org/.

max-ionov commented 2 years ago

It will generate a new DOI for each new version, but there will still be a DOI for the "top" version that covers all versions. I think this should be okay.
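As a rough sketch of that idea, the metadata could always point at the "top" DOI and fall back to the version-specific one only when no top DOI is present. The record layout below, with `conceptdoi` for the all-versions DOI and `doi` for a single version, is an assumption modelled on Zenodo's record JSON, and the DOI values are made-up placeholders.

```python
def doi_for_metadata(record: dict) -> str:
    """Pick the DOI to reference from the dictionary metadata.

    Prefers the all-versions ("top") DOI, assumed to live under the key
    `conceptdoi`; falls back to the version-specific `doi` otherwise.
    """
    return record.get("conceptdoi") or record["doi"]


# Placeholder records, not real deposits.
versioned = {"doi": "10.5281/zenodo.0000002", "conceptdoi": "10.5281/zenodo.0000001"}
unversioned = {"doi": "10.5281/zenodo.0000003"}

stable_link = "https://doi.org/" + doi_for_metadata(versioned)
```

With this choice, depositing a new version would not require touching the link in previously published metadata.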