IQSS / dataverse

Open source research data repository software
http://dataverse.org

Unconsidered harvesting granularity #11020

Open luddaniel opened 1 week ago

luddaniel commented 1 week ago

Dataverse does not respect the harvesting granularity supported by the remote repository when harvesting over the OAI-PMH protocol. It always uses the finest harvesting granularity, YYYY-MM-DDThh:mm:ssZ.

According to the specification (https://www.openarchives.org/OAI/openarchivesprotocol.html#Dates):

The legitimate formats are YYYY-MM-DD and YYYY-MM-DDThh:mm:ssZ. Both arguments must have the same granularity. All repositories must support YYYY-MM-DD. A repository that supports YYYY-MM-DDThh:mm:ssZ should indicate so in the Identify response. A request by a harvester with finer granularity than that supported by a repository must produce an error.
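
As an illustration of the rules quoted above, here is a minimal Python sketch of the check a repository is expected to apply to the from and until arguments. The function names (`granularity_of`, `check_arguments`) are hypothetical and not taken from Dataverse or any OAI-PMH library; returning the error code as a string is also just an assumption made for brevity.

```python
import re

DAY = "YYYY-MM-DD"
SECONDS = "YYYY-MM-DDThh:mm:ssZ"

_DAY_RE = re.compile(r"\d{4}-\d{2}-\d{2}")
_SECONDS_RE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z")


def granularity_of(datestamp):
    """Return the granularity of a datestamp, or raise ValueError if its format is not legitimate."""
    if _DAY_RE.fullmatch(datestamp):
        return DAY
    if _SECONDS_RE.fullmatch(datestamp):
        return SECONDS
    raise ValueError(f"not a legitimate OAI-PMH datestamp: {datestamp!r}")


def check_arguments(supported, from_=None, until=None):
    """Return None if the arguments are acceptable, else the string 'badArgument'.

    `supported` is the granularity the repository advertises in Identify.
    """
    try:
        found = [granularity_of(d) for d in (from_, until) if d is not None]
    except ValueError:
        return "badArgument"
    # Both arguments must have the same granularity.
    if len(set(found)) > 1:
        return "badArgument"
    # A request finer than what the repository supports must produce an error.
    if supported == DAY and SECONDS in found:
        return "badArgument"
    # YYYY-MM-DD is always acceptable, since all repositories must support it.
    return None
```

For example, `check_arguments(DAY, from_="2024-11-07T14:40:49Z")` returns `"badArgument"`, while `check_arguments(DAY, from_="2024-11-07")` returns `None`.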

Examples:

https://dataverse.ird.fr/oai?verb=Identify
<granularity>YYYY-MM-DDThh:mm:ssZ</granularity>
https://dataverse.ird.fr/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2024-11-07T14%3A40%3A49Z
OK
https://dataverse.ird.fr/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2024-11-07
OK

https://api.nakala.fr/oai2?verb=Identify
<granularity>YYYY-MM-DD</granularity>
https://api.nakala.fr/oai2?verb=ListRecords&metadataPrefix=oai_dc&from=2024-11-07
OK
https://api.nakala.fr/oai2?verb=ListRecords&metadataPrefix=oai_dc&from=2024-11-07T14%3A40%3A49Z
Error Code badArgument

Which is legitimate! But this is the request sent by Dataverse.

What should be done?

Send an Identify request to obtain the granularity supported by the repository before sending any request with from or until arguments. Also, YYYY-MM-DD should be the default.
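
To make the proposal concrete, here is a rough Python sketch of the harvester side. It is not Dataverse's code (Dataverse is written in Java), and the helper names `supported_granularity` and `format_datestamp` are hypothetical. It assumes the Identify response has already been fetched as an XML string and that the harvest timestamp is a timezone-aware `datetime`.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DAY = "YYYY-MM-DD"
SECONDS = "YYYY-MM-DDThh:mm:ssZ"


def supported_granularity(identify_xml):
    """Read <granularity> from an Identify response, defaulting to YYYY-MM-DD."""
    root = ET.fromstring(identify_xml)
    node = root.find(f"{OAI_NS}Identify/{OAI_NS}granularity")
    value = node.text.strip() if node is not None and node.text else ""
    # Anything other than an explicit seconds granularity falls back to the
    # day format, which every repository must support.
    return SECONDS if value == SECONDS else DAY


def format_datestamp(moment, granularity):
    """Format a timezone-aware datetime for use as a from/until argument."""
    utc = moment.astimezone(timezone.utc)
    if granularity == SECONDS:
        return utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    return utc.strftime("%Y-%m-%d")
```

With a repository that advertises `<granularity>YYYY-MM-DD</granularity>`, a last-harvest time of 2024-11-07 14:40:49 UTC would be sent as `from=2024-11-07`. Truncating to the day means records from earlier that day may be returned again, so the harvester would need to tolerate duplicates.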