inveniosoftware / docs-invenio-rdm

InvenioRDM docs
https://inveniordm.docs.cern.ch/
MIT License

Metadata documentation may not accurately reflect the state of the API's ability to handle fields that are not part of the deposit page. #338

Closed marijane closed 2 years ago

marijane commented 2 years ago

Package version (if known): 8.0

Describe the bug

On the metadata documentation page, the fields sizes, formats, locations, funding, and references indicate they are available via the API, but attempts to add these fields to the record.json file in the InvenioRDM REST API Example will result in a failure to publish the record when Upload.py is run.

Steps to Reproduce

  1. Clone the InvenioRDM REST API Example repository and edit Upload.py with the server location and API access token.
  2. Edit record.json to add the sizes, formats, locations, funding, and references fields. I copied the examples from the Metadata page of the documentation into the metadata section of the file, omitting the enclosing braces shown in the documentation because I understand those to be the braces of the metadata section itself. On separate attempts, I tried placing them both after the version field and alphabetically within the metadata section.
  3. Run Upload.py
  4. Log into InvenioRDM and view the dashboard to find an unpublished record
  5. Edit the record and attempt to publish it

Observed Behavior

There was an internal error (and the record was not saved). Unknown field: Invalid scheme.

This would seem to indicate the API is not yet ready for these fields to be used and the documentation is incorrect.

Expected behavior

The fields should be available via the API, as the documentation indicates. The record should publish successfully, and not as a metadata-only record.

Additional context

I have attempted this on both my institution's test InvenioRDM 8.0 instance and the one at https://inveniordm.web.cern.ch/, and I observe the same behavior on both systems.

marijane commented 2 years ago

Here's my modified JSON file. (In TXT format because GitHub doesn't accept JSON file format) record.txt

fenekku commented 2 years ago

Hi @marijane thanks for trying out InvenioRDM v8!

The JSON sent should be slightly different than the one in record.txt. It should have the funding, locations, formats in the metadata:

{
  "access": {
    "files": "public",
    "record": "public"
  },
  "files": {
    "enabled": true
  },
  "metadata": {
    "creators": [
      {
        "affiliations": [
          {
            "id": "01ggx4157",
            "name": "European Organization for Nuclear Research"
          }
        ],
        "person_or_org": {
          "family_name": "van de Sandt",
          "given_name": "Stephanie",
          "identifiers": [
            {
              "identifier": "0000-0002-9576-1974",
              "scheme": "orcid"
            }
          ],
          "type": "personal"
        }
      },
      {
        "affiliations": [
          {
            "id": "01ggx4157",
            "name": "European Organization for Nuclear Research"
          }
        ],
        "person_or_org": {
          "family_name": "Nielsen",
          "given_name": "Lars Holm",
          "identifiers": [
            {
              "identifier": "0000-0001-8135-3489",
              "scheme": "orcid"
            }
          ],
          "type": "personal"
        }
      },
      {
        "affiliations": [
          {
            "id": "01ggx4157",
            "name": "European Organization for Nuclear Research"
          }
        ],
        "person_or_org": {
          "family_name": "Ioannidis",
          "given_name": "Alexandros",
          "identifiers": [
            {
              "identifier": "0000-0002-5082-6404",
              "scheme": "orcid"
            }
          ],
          "type": "personal"
        }
      },
      {
        "affiliations": [
          {
            "name": "American Astronomical Society"
          }
        ],
        "person_or_org": {
          "family_name": "Muench",
          "given_name": "August",
          "identifiers": [
            {
              "identifier": "0000-0003-0666-6367",
              "scheme": "orcid"
            }
          ],
          "type": "personal"
        }
      },
      {
        "affiliations": [
          {
            "name": "Harvard-Smithsonian Center for Astrophysics"
          }
        ],
        "person_or_org": {
          "family_name": "Henneken",
          "given_name": "Edwin",
          "identifiers": [
            {
              "identifier": "0000-0003-4264-2450",
              "scheme": "orcid"
            }
          ],
          "type": "personal"
        }
      },
      {
        "affiliations": [
          {
            "name": "Harvard-Smithsonian Center for Astrophysics"
          }
        ],
        "person_or_org": {
          "family_name": "Accomazzi",
          "given_name": "Alberto",
          "identifiers": [
            {
              "identifier": "0000-0002-4110-3511",
              "scheme": "orcid"
            }
          ],
          "type": "personal"
        }
      },
      {
        "affiliations": [
          {
            "id": "01ggx4157",
            "name": "European Organization for Nuclear Research"
          }
        ],
        "person_or_org": {
          "family_name": "Bigarella",
          "given_name": "Chiara",
          "type": "personal"
        }
      },
      {
        "affiliations": [
          {
            "id": "01ggx4157",
            "name": "European Organization for Nuclear Research"
          }
        ],
        "person_or_org": {
          "family_name": "Gonzalez Lopez",
          "given_name": "Jose Benito",
          "identifiers": [
            {
              "identifier": "0000-0002-0816-7126",
              "scheme": "orcid"
            }
          ],
          "type": "personal"
        }
      },
      {
        "affiliations": [
          {
            "id": "01ggx4157",
            "name": "European Organization for Nuclear Research"
          }
        ],
        "person_or_org": {
          "family_name": "Dallmeier-Tiessen",
          "given_name": "S\u00fcnje",
          "identifiers": [
            {
              "identifier": "0000-0002-6137-2348",
              "scheme": "orcid"
            }
          ],
          "type": "personal"
        }
      }
    ],
    "description": "<p>Data and software citations are crucial for the transparency of research results and for the transmission of credit. But they are hard to track, because of the absence of a common citation standard. As a consequence, the FORCE11 recently proposed data and software citation principles as guidance for authors.&nbsp;</p><p>Zenodo is recognized for the implementation of DOIs for software on a large scale. The minting of complementary DOIs for the version and concept allows measuring the impact of dynamic software. This article investigates characteristics of 5,456 citations to Zenodo data and software that were captured by the Asclepias Broker in January 2019. We analyzed the current state of data and software citation practices and the quality of software citation recommendations with regard to the impact of recent standardization efforts. Our findings prove that current citation practices and recommendations do not match proposed citation standards. We consequently suggest practical first steps towards the implementation of the software citation principles.</p>",
    "identifiers": [
      {
        "identifier": "arXiv:1911.00295",
        "scheme": "arxiv"
      }
    ],
    "publication_date": "2019-11-01",
    "publisher": "arXiv",
    "references": [
      {
        "reference": "Nielsen et al,..",
        "identifier": "10.1234/foo.bar",
        "scheme": "doi"
      }
    ],
    "resource_type": {
      "id": "publication-preprint"
    },
    "rights": [
      {
        "id": "cc-by-4.0"
      },
      {
        "title": {
          "en": "Copyright (C) 2019 CERN, AAS and Harvard CfA"
        }
      }
    ],
    "sizes": [
      "11 pages"
    ],
    "subjects": [
      {
        "subject": "Software citation"
      },
      {
        "subject": "Zenodo"
      },
      {
        "subject": "FAIR principles"
      },
      {
        "subject": "Digital Libraries"
      }
    ],
    "title": "Practice meets Principle: Tracking Software and Data Citations to Zenodo DOIs",
    "version": "v1",
    "formats": [
      "application/pdf"
    ],
    "funding": [
      {
        "funder": {
          "name": "European Commission",
          "identifier": "00k4n6c32",
          "scheme": "ror"
        },
        "award": {
          "title": "OpenAIRE",
          "number": "246686",
          "identifier": ".../246686",
          "scheme": "openaire"
        }
      }
    ],
    "locations": {
      "features": [
        {
          "geometry": {
            "type": "Point",
            "coordinates": [
              6.05,
              46.23333
            ]
          },
          "identifiers": {
            "geonames": "2661235",
            "tgn": "http://vocab.getty.edu/tgn/8703679"
          },
          "place": "CERN",
          "description": "Invenio birth place."
        }
      ]
    }
  },
  "pids": {}
}

Even then, the warning at the top of the documentation for these metadata fields should probably be clearer in its intent. What was really meant was that these fields might be unstable until they are added to the deposit page. That being said, try with this fixed JSON and see what errors the API returns now, if any. The API should provide an "errors" field with descriptive information. This way we can home in on the issue.

Thanks for reporting!
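
The placement rule above (these fields go inside "metadata") can be checked before uploading. The following is only a minimal sketch: the helper names `misplaced_fields` and `move_into_metadata` are hypothetical and not part of InvenioRDM or the example repository, and it assumes the record is a plain dict loaded from record.json.

```python
from __future__ import annotations

# Fields discussed in this thread; they belong inside "metadata",
# not at the top level of the record.
METADATA_ONLY_FIELDS = ("sizes", "formats", "locations", "funding", "references")


def misplaced_fields(record: dict) -> list[str]:
    """Return the listed fields that sit at the top level of the record."""
    return [name for name in METADATA_ONLY_FIELDS if name in record]


def move_into_metadata(record: dict) -> dict:
    """Return a copy of the record with misplaced fields moved under "metadata"."""
    fixed = {k: v for k, v in record.items() if k not in METADATA_ONLY_FIELDS}
    metadata = dict(record.get("metadata", {}))
    for name in misplaced_fields(record):
        # Do not overwrite a value that is already correctly placed.
        metadata.setdefault(name, record[name])
    fixed["metadata"] = metadata
    return fixed
```

Calling `misplaced_fields(record)` on the loaded file before running Upload.py would show whether any of the five fields ended up outside the metadata section.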

marijane commented 2 years ago

@fenekku Thanks! I see now that the specific JSON I shared was kind of mangled, oops, I've been trying lots of things and shared the wrong version, and I can see my attempt to insert fields alphabetically went awry. On my first attempt, however, I put them all at the end of the metadata section after the version field, and it still didn't work (that's what led me to try alphabetizing).

The one you've shared here still has the same problem: AssertionError: Failed to publish record (code: 400) when the script runs, an unpublished record in my Dashboard, and an "Invalid schema" error when I try to publish it.

fenekku commented 2 years ago

Ah ok. Can you print out what you are getting by adding to the script before line 43:

print(json.dumps(r.json(), sort_keys=True, indent=2, separators=(',', ': ')))

Looking at the "errors" key in the json there should tell us what went wrong. :detective:
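
If the full dump is too long to scan, the "errors" key can be condensed to one line per message. This is a sketch only: `summarize_errors` is a hypothetical helper, and it assumes each entry under "errors" is an object with a "field" string and a "messages" list.

```python
from __future__ import annotations


def summarize_errors(body: dict) -> list[str]:
    """Flatten the "errors" list of a response body into "field: message" lines."""
    lines = []
    for error in body.get("errors", []):
        field = error.get("field", "<unknown field>")
        for message in error.get("messages", []):
            lines.append(f"{field}: {message}")
    return lines
```

In the script, this could be used as `print("\n".join(summarize_errors(r.json())))`, where `r` is the same response object as in the print statement above.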

marijane commented 2 years ago

Here it is.

{
  "access": {
    "embargo": {
      "active": false,
      "reason": null
    },
    "files": "public",
    "record": "public",
    "status": "metadata-only"
  },
  "created": "2022-05-12T04:29:49.243240+00:00",
  "errors": [
    {
      "field": "metadata.locations.features.0.identifiers",
      "messages": [
        "Not a valid list."
      ]
    },
    {
      "field": "metadata.references.0.scheme",
      "messages": [
        "Invalid scheme."
      ]
    }
  ],
  "expires_at": "2022-05-12 04:29:49.243287",
  "files": {
    "enabled": true,
    "order": []
  },
  "id": "td2n4-jf692",
  "is_draft": true,
  "is_published": false,
  "links": {
    "access_links": "https://inveniordm.web.cern.ch/api/records/td2n4-jf692/access/links",
    "files": "https://inveniordm.web.cern.ch/api/records/td2n4-jf692/draft/files",
    "publish": "https://inveniordm.web.cern.ch/api/records/td2n4-jf692/draft/actions/publish",
    "record": "https://inveniordm.web.cern.ch/api/records/td2n4-jf692",
    "record_html": "https://inveniordm.web.cern.ch/records/td2n4-jf692",
    "reserve_doi": "https://inveniordm.web.cern.ch/api/records/td2n4-jf692/draft/pids/doi",
    "review": "https://inveniordm.web.cern.ch/api/records/td2n4-jf692/draft/review",
    "self": "https://inveniordm.web.cern.ch/api/records/td2n4-jf692/draft",
    "self_html": "https://inveniordm.web.cern.ch/uploads/td2n4-jf692",
    "versions": "https://inveniordm.web.cern.ch/api/records/td2n4-jf692/versions"
  },
  "metadata": {
    "creators": [
      {
        "affiliations": [
          {
            "id": "01ggx4157",
            "name": "European Organization for Nuclear Research"
          }
        ],
        "person_or_org": {
          "family_name": "van de Sandt",
          "given_name": "Stephanie",
          "identifiers": [
            {
              "identifier": "0000-0002-9576-1974",
              "scheme": "orcid"
            }
          ],
          "name": "van de Sandt, Stephanie",
          "type": "personal"
        }
      },
      {
        "affiliations": [
          {
            "id": "01ggx4157",
            "name": "European Organization for Nuclear Research"
          }
        ],
        "person_or_org": {
          "family_name": "Nielsen",
          "given_name": "Lars Holm",
          "identifiers": [
            {
              "identifier": "0000-0001-8135-3489",
              "scheme": "orcid"
            }
          ],
          "name": "Nielsen, Lars Holm",
          "type": "personal"
        }
      },
      {
        "affiliations": [
          {
            "id": "01ggx4157",
            "name": "European Organization for Nuclear Research"
          }
        ],
        "person_or_org": {
          "family_name": "Ioannidis",
          "given_name": "Alexandros",
          "identifiers": [
            {
              "identifier": "0000-0002-5082-6404",
              "scheme": "orcid"
            }
          ],
          "name": "Ioannidis, Alexandros",
          "type": "personal"
        }
      },
      {
        "affiliations": [
          {
            "name": "American Astronomical Society"
          }
        ],
        "person_or_org": {
          "family_name": "Muench",
          "given_name": "August",
          "identifiers": [
            {
              "identifier": "0000-0003-0666-6367",
              "scheme": "orcid"
            }
          ],
          "name": "Muench, August",
          "type": "personal"
        }
      },
      {
        "affiliations": [
          {
            "name": "Harvard-Smithsonian Center for Astrophysics"
          }
        ],
        "person_or_org": {
          "family_name": "Henneken",
          "given_name": "Edwin",
          "identifiers": [
            {
              "identifier": "0000-0003-4264-2450",
              "scheme": "orcid"
            }
          ],
          "name": "Henneken, Edwin",
          "type": "personal"
        }
      },
      {
        "affiliations": [
          {
            "name": "Harvard-Smithsonian Center for Astrophysics"
          }
        ],
        "person_or_org": {
          "family_name": "Accomazzi",
          "given_name": "Alberto",
          "identifiers": [
            {
              "identifier": "0000-0002-4110-3511",
              "scheme": "orcid"
            }
          ],
          "name": "Accomazzi, Alberto",
          "type": "personal"
        }
      },
      {
        "affiliations": [
          {
            "id": "01ggx4157",
            "name": "European Organization for Nuclear Research"
          }
        ],
        "person_or_org": {
          "family_name": "Bigarella",
          "given_name": "Chiara",
          "name": "Bigarella, Chiara",
          "type": "personal"
        }
      },
      {
        "affiliations": [
          {
            "id": "01ggx4157",
            "name": "European Organization for Nuclear Research"
          }
        ],
        "person_or_org": {
          "family_name": "Gonzalez Lopez",
          "given_name": "Jose Benito",
          "identifiers": [
            {
              "identifier": "0000-0002-0816-7126",
              "scheme": "orcid"
            }
          ],
          "name": "Gonzalez Lopez, Jose Benito",
          "type": "personal"
        }
      },
      {
        "affiliations": [
          {
            "id": "01ggx4157",
            "name": "European Organization for Nuclear Research"
          }
        ],
        "person_or_org": {
          "family_name": "Dallmeier-Tiessen",
          "given_name": "S\u00fcnje",
          "identifiers": [
            {
              "identifier": "0000-0002-6137-2348",
              "scheme": "orcid"
            }
          ],
          "name": "Dallmeier-Tiessen, S\u00fcnje",
          "type": "personal"
        }
      }
    ],
    "description": "<p>Data and software citations are crucial for the transparency of research results and for the transmission of credit. But they are hard to track, because of the absence of a common citation standard. As a consequence, the FORCE11 recently proposed data and software citation principles as guidance for authors.&nbsp;</p><p>Zenodo is recognized for the implementation of DOIs for software on a large scale. The minting of complementary DOIs for the version and concept allows measuring the impact of dynamic software. This article investigates characteristics of 5,456 citations to Zenodo data and software that were captured by the Asclepias Broker in January 2019. We analyzed the current state of data and software citation practices and the quality of software citation recommendations with regard to the impact of recent standardization efforts. Our findings prove that current citation practices and recommendations do not match proposed citation standards. We consequently suggest practical first steps towards the implementation of the software citation principles.</p>",
    "formats": [
      "application/pdf"
    ],
    "funding": [
      {
        "award": {
          "identifier": ".../246686",
          "number": "246686",
          "scheme": "openaire",
          "title": "OpenAIRE"
        },
        "funder": {
          "identifier": "00k4n6c32",
          "name": "European Commission",
          "scheme": "ror"
        }
      }
    ],
    "identifiers": [
      {
        "identifier": "arXiv:1911.00295",
        "scheme": "arxiv"
      }
    ],
    "locations": {
      "features": [
        {
          "description": "Invenio birth place.",
          "geometry": {
            "coordinates": [
              6.05,
              46.23333
            ],
            "type": "Point"
          },
          "place": "CERN"
        }
      ]
    },
    "publication_date": "2019-11-01",
    "publisher": "arXiv",
    "references": [
      {
        "identifier": "10.1234/foo.bar",
        "reference": "Nielsen et al,..",
        "scheme": "doi"
      }
    ],
    "resource_type": {
      "id": "publication-preprint",
      "title": {
        "en": "Preprint"
      }
    },
    "rights": [
      {
        "description": {
          "en": "The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited."
        },
        "icon": "cc-by-icon",
        "id": "cc-by-4.0",
        "props": {
          "scheme": "spdx",
          "url": "https://creativecommons.org/licenses/by/4.0/legalcode"
        },
        "title": {
          "en": "Creative Commons Attribution 4.0 International"
        }
      },
      {
        "title": {
          "en": "Copyright (C) 2019 CERN, AAS and Harvard CfA"
        }
      }
    ],
    "sizes": [
      "11 pages"
    ],
    "subjects": [
      {
        "subject": "Software citation"
      },
      {
        "subject": "Zenodo"
      },
      {
        "subject": "FAIR principles"
      },
      {
        "subject": "Digital Libraries"
      }
    ],
    "title": "Practice meets Principle: Tracking Software and Data Citations to Zenodo DOIs",
    "version": "v1"
  },
  "parent": {
    "access": {
      "links": [],
      "owned_by": [
        {
          "user": 29
        }
      ]
    },
    "communities": {},
    "id": "dbm2h-2m912"
  },
  "pids": {},
  "revision_id": 4,
  "updated": "2022-05-12T04:29:49.332446+00:00",
  "versions": {
    "index": 1,
    "is_latest": false,
    "is_latest_draft": true
  }
}

fenekku commented 2 years ago

Thanks! The issue has to do with the locations and references. The documentation was out-of-date. I fixed the documentation here: https://github.com/inveniosoftware/docs-invenio-rdm/pull/340 .

Locations should be more like:

{
  "locations": {
    "features": [{
      "geometry": {
        "type": "Point",
        "coordinates": [6.05, 46.23333]
      },
      "identifiers": [{
        "scheme": "geonames",
        "identifier": "2661235"
      }],
      "place": "CERN",
      "description": "Invenio birth place."
    }]
  }
}

By default, only the wikidata and geonames schemes are supported.

References should be more like:

{
  "references": [{
      "reference": "Nielsen et al,..",
      "identifier": "10.1234/foo.bar",
      "scheme": "other"
  }]
}

DOI is not supported. I am not sure if it's an oversight or not.
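
For records already written in the old documented form, the two corrections above could be applied programmatically. This is a minimal sketch, not an InvenioRDM API: `fix_metadata` and the two scheme sets are hypothetical names, and the scheme lists are assumptions taken only from this thread (geonames and wikidata accepted for locations by default, "doi" rejected for references).

```python
from __future__ import annotations

import copy

# Assumption from this thread: schemes accepted by default for location identifiers.
LOCATION_SCHEMES = {"geonames", "wikidata"}
# Assumption from this thread: "doi" is rejected for references, "other" is accepted.
UNSUPPORTED_REFERENCE_SCHEMES = {"doi"}


def fix_metadata(metadata: dict) -> dict:
    """Return a corrected copy of a metadata dict; the input is left untouched."""
    fixed = copy.deepcopy(metadata)

    for feature in fixed.get("locations", {}).get("features", []):
        identifiers = feature.get("identifiers")
        if isinstance(identifiers, dict):
            # Old form: {"geonames": "2661235", ...}. Convert to a list of
            # {"scheme", "identifier"} objects and drop unsupported schemes.
            feature["identifiers"] = [
                {"scheme": scheme, "identifier": value}
                for scheme, value in identifiers.items()
                if scheme in LOCATION_SCHEMES
            ]

    for reference in fixed.get("references", []):
        if reference.get("scheme") in UNSUPPORTED_REFERENCE_SCHEMES:
            # Fall back to the generic scheme shown in the example above.
            reference["scheme"] = "other"

    return fixed
```

Note that this sketch silently drops identifiers with unsupported schemes (such as tgn in the original example); an instance configured with additional schemes would need a different `LOCATION_SCHEMES` set.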

marijane commented 2 years ago

Excellent! I just tested it out on our instance and it works! I'm glad I was able to help keep the documentation up-to-date.