Closed dspinellis closed 1 month ago
@evgepab is this something obvious that you can easily fix? Otherwise, I'll look at it.
The file buggy.tar.gz quickly replicates the problem. Run the command:
bin/a3k populate buggy.db datacite buggy.tar.gz
This seems to be the JSON data that causes the problem.
{
  "container": {},
  "reason": null,
  "formats": [],
  "fundingReferences": [],
  "prefix": "10.17031",
  "creators": [
    {
      "nameType": "Personal",
      "affiliation": {
        "name": "Marine Biological Association"
      },
      "givenName": "Clare",
      "familyName": "Ostle",
      "name": "Clare Ostle"
    }
  ],
  "registered": "2022-12-05T17:08:07Z",
  "language": null,
  "source": "api",
  "suffix": "637b5e4a8d3ae",
  "relatedItems": [],
  "descriptions": [
    {
      "descriptionType": "Abstract",
      "description": "CSV file containing CPR data. Taxa are summed within each grouping (given in headings), monthly means have been calculated for a Northeast Pacific region > 1000 m isobath. Plankton abundance counts are recorded according to standard CPR methodology, see Richardson et al. (2006, Prog in Oceanog 68. 27-74). Units are \"Number of cells per sample” for the phytoplankton groupings, and “Number of organisms per sample” for the zooplankton groupings."
    }
  ],
  "schemaVersion": null,
  "sizes": [],
  "metadataVersion": 0,
  "types": {
    "schemaOrg": "Dataset",
    "resourceTypeGeneral": "Dataset",
    "citeproc": "dataset",
    "bibtex": "misc",
    "ris": "DATA",
    "resourceType": "dataset"
  },
  "isActive": true,
  "relatedIdentifiers": [],
  "created": "2022-12-05T17:08:06Z",
  "identifiers": [],
  "subjects": [],
  "dates": [],
  "published": "2022",
  "titles": [
    {
      "title": "Monthly CPR data grouped in Northeast Pacific region (> 1000 m isobath)"
    }
  ],
  "geoLocations": [],
  "url": "https://doi.mba.ac.uk/data/2956",
  "rightsList": [
    {
      "rightsUri": "https://creativecommons.org/licenses/by-nc/4.0/",
      "rights": "Creative Commons NonCommercial 4.0 International"
    }
  ],
  "publicationYear": 2022,
  "publisher": "The Archive for Marine Species and Habitats Data (DASSH)",
  "contentUrl": null,
  "contributors": [],
  "updated": "2022-12-05T17:08:07Z",
  "doi": "10.17031/637b5e4a8d3ae",
  "alternateIdentifiers": [],
  "state": "findable",
  "version": null
}
Note that in correctly loaded data "affiliation" is an array:
"creators": [
  {
    "nameType": "Personal",
    "affiliation": [
      {
        "name": "Marine Biological Association"
      }
    ],
    "givenName": "David",
    "familyName": "Johns",
whereas in the problematic data it is a dictionary.
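For illustration, here is a minimal sketch of how a loader could accept both shapes by wrapping a single affiliation object in a list before processing it. The helper name normalize_affiliation is hypothetical and is not taken from the a3k code base; the two sample creators are trimmed from the records shown above. This only shows the general idea, not the fix that was actually applied.

```python
import json


def normalize_affiliation(value):
    """Return the affiliation as a list, whatever shape it arrived in.

    Hypothetical helper (not part of a3k): a missing value becomes an
    empty list, a single object is wrapped in a one-element list, and an
    existing list is copied unchanged.
    """
    if value is None:
        return []
    if isinstance(value, dict):
        return [value]
    return list(value)


# Trimmed sample: the first creator uses the problematic object form,
# the second uses the array form seen in correctly loaded data.
record = json.loads("""
{
  "creators": [
    {"familyName": "Ostle",
     "affiliation": {"name": "Marine Biological Association"}},
    {"familyName": "Johns",
     "affiliation": [{"name": "Marine Biological Association"}]}
  ]
}
""")

for creator in record["creators"]:
    creator["affiliation"] = normalize_affiliation(creator.get("affiliation"))

# After normalization every creator can be handled with the same loop.
affiliation_names = [
    [affiliation["name"] for affiliation in creator["affiliation"]]
    for creator in record["creators"]
]
print(affiliation_names)
```

With this approach the code that populates the affiliation tables would only ever need to iterate over a list, regardless of which form the source record used.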
@evgepab is this something obvious that you can easily fix? Otherwise, I'll look at it.
I believe I can look into it! I guess this happens because of an outdated metadata version, which assumed that each creator could have only one affiliation.
Thanks! I woke up with a fix in mind, so I'm implementing it.
When running the following command:
late in the process the following error occurs: