IQSS / dataverse

Open source research data repository software
http://dataverse.org

Spike: Review use of Dublin Core date metadata in dataset page HTML metatags and in DC metadata published over OAI-PMH #9288

Open jggautier opened 1 year ago

jggautier commented 1 year ago

I was updating the Dataverse metadata crosswalk to account for metadata mapping changes added in Dataverse v5.12.1 when I noticed that the Dublin Core Elements Date property now contains different dates depending on how the metadata is published.

How Dublin Core Elements' Date property is used in Dataverse

Dataverse publishes dataset metadata using Dublin Core Elements in two ways: in the HTML metatags of each dataset page, and in records published over OAI-PMH (oai_dc).

Change in v5.12.1 to improve metadata published over OAI-PMH

The two screenshots above show metadata for a dataset with no Production Date. Its Publication Date, 2023-01-11, is used in the DC Elements Date property both in the record published in the OAI-PMH feed and in the dataset page's HTML metatags.
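For anyone who wants to compare the two values without screenshots, here is a minimal Python sketch that pulls the date out of an oai_dc record and out of a page's metatags. The sample XML and HTML strings are hypothetical stand-ins, not copies of real Dataverse output, and the helper names are made up for illustration. The only details taken from this issue are the `DC.date` metatag name and the 2023-01-11 value.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Namespace of Dublin Core Elements 1.1, used by dc:date in oai_dc records.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical, trimmed-down samples for illustration only.
SAMPLE_OAI_DC = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example dataset</dc:title>
  <dc:date>2023-01-11</dc:date>
</oai_dc:dc>"""

SAMPLE_HTML = """<html><head>
  <meta name="DC.date" content="2023-01-11"/>
</head><body></body></html>"""


def oai_dc_dates(xml_text):
    """Return the text of every dc:date element in an oai_dc record."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter(f"{{{DC_NS}}}date")]


class _MetaDateParser(HTMLParser):
    """Collect the content of every <meta name="DC.date"> tag."""

    def __init__(self):
        super().__init__()
        self.dates = []

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attributes = dict(attrs)
            if attributes.get("name") == "DC.date":
                self.dates.append(attributes.get("content"))


def metatag_dates(html_text):
    """Return the DC.date values found in a page's metatags."""
    parser = _MetaDateParser()
    parser.feed(html_text)
    return parser.dates


print(oai_dc_dates(SAMPLE_OAI_DC))
print(metatag_dates(SAMPLE_HTML))
```

If the two lists differ for the same dataset, the two publishing paths are choosing different dates.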

In v5.12.1 (https://github.com/IQSS/dataverse/pull/8733), when Dataverse publishes dataset metadata over OAI-PMH and when that dataset has no Production Date, logic was added to use a dataset's Publication Date for the DC Date property. This ensures there's always a dc:date in metadata published over OAI-PMH using Dublin Core Elements (oai_dc), so that when that metadata is harvested, the harvester is able to display a date for the dataset that gives people some idea about when the data in the dataset was made available.
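The fallback described above can be sketched in a few lines. This is an illustrative Python sketch, not Dataverse's actual (Java) implementation. The function name `choose_dc_date` is hypothetical, and the Publication Date in the second call is a placeholder rather than a value from a real dataset.

```python
from typing import Optional


def choose_dc_date(production_date: Optional[str], publication_date: str) -> str:
    """Pick the dc:date value: Production Date if present, else Publication Date."""
    if production_date and production_date.strip():
        return production_date.strip()
    # No Production Date: fall back so the record always carries a dc:date.
    return publication_date


# Dataset with no Production Date: the Publication Date is used.
print(choose_dc_date(None, "2023-01-11"))          # 2023-01-11

# Dataset with a Production Date: it takes precedence.
# "2019-01-01" is a placeholder Publication Date, not real data.
print(choose_dc_date("2018-11-24", "2019-01-01"))  # 2018-11-24
```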

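But when a dataset does have a Production Date, as with the dataset at https://doi.org/10.7910/DVN/ONIAZI, the two outputs differ: the dc:date in the OAI-PMH record is the Production Date (2018-11-24), while the DC.date in the dataset page's metatags is a different date.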

Purpose of HTML metatags versus purpose of OAI-PMH

The metatags were added in v4.7 (https://github.com/IQSS/dataverse/issues/1393) to improve "Zotero/Endnote import, search engine discoverability, and support for citation tracking tools like Altmetrics". I think publishing metadata over OAI-PMH serves a similar purpose, making metadata discoverable in different places.

Questions

Does the Dataverse community agree that in both cases, when Dataverse publishes metadata over OAI-PMH and in dataset page metatags, it should prefer using Production Date in the dc:date property, and use the Publication Date only when there's no Production Date?

If so, I suggest we use this logic for each dataset page's metatags. That would mean that for the example dataset that has a Production Date, the DC.date in the dataset page's metatags would also use the Production Date (2018-11-24).
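As a rough illustration of the suggestion, the sketch below renders a dataset page's `DC.date` metatag using the same Production-first rule. It is a Python sketch under stated assumptions, not Dataverse code: the function name `dc_date_metatag`, the exact tag markup, and the placeholder Publication Date are all hypothetical.

```python
from html import escape
from typing import Optional


def dc_date_metatag(production_date: Optional[str], publication_date: str) -> str:
    """Render a DC.date metatag, preferring Production Date over Publication Date."""
    date = production_date if production_date else publication_date
    # Escape the value so it is safe inside an HTML attribute.
    return f'<meta name="DC.date" content="{escape(date, quote=True)}"/>'


# Example dataset with a Production Date of 2018-11-24.
# "2019-01-01" is a placeholder Publication Date, not real data.
print(dc_date_metatag("2018-11-24", "2019-01-01"))

# Dataset without a Production Date falls back to its Publication Date.
print(dc_date_metatag(None, "2023-01-11"))
```

With this rule in place, the metatag and the OAI-PMH record would carry the same date for a given dataset.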

Definition of done: The community is able to review how Dataverse uses Publication Dates and Production Dates when publishing metadata in different ways and decide if the current logic is okay or should be changed.

If the logic should be changed, we could open another GitHub issue to describe the changes and track changes we need to make to the software.

jggautier commented 1 month ago

I've revisited this GitHub issue as part of an effort to review and prioritize work proposed in GitHub issues in the IQSS/Dataverse repo that have been open for years (https://github.com/IQSS/dataverse-pm/issues/114).

I verified that what I described in this issue is still accurate as of the latest Dataverse release, v6.2, and edited the title and description a little for clarity.