inveniosoftware / invenio-app-rdm

Turn-key research data management platform.
https://inveniordm.docs.cern.ch
MIT License

Vocabularies: 'id' Field Requirement for Names Schema in InvenioRDM #2545

Closed Samk13 closed 8 months ago

Samk13 commented 11 months ago

Package version (if known): V12dev > 25

Describe the bug

It is unclear whether the 'id' field requirement in the names schema in InvenioRDM is an intentional change or a bug. Names added manually through instance setup or the command line that lack an 'id' field now fail with a validation error, which was not the case before V12dev25.

The Question:

Steps to Reproduce

1- On a fresh install of the latest v12, use the following app_data/vocabularies/names.yaml:

- affiliations:
  - name: University of Zurich
  - name: Humboldt University of Berlin
  - name: Kaiser Wilhelm Institute for Physics
  family_name: Einstein # <-- This will not be imported
  given_name: Albert
  identifiers:
  - identifier: gnd:118529579
    scheme: gnd
- affiliations:
  - name: University of Cambridge
  - name: California Institute of Technology
  - name: University of Oxford
  family_name: Hawking
  given_name: Stephen
  id: 0000-0002-9079-593X # <-- this will be imported
  identifiers:
  - identifier: https://orcid.org/0000-0002-9079-593X
    scheme: orcid

2- Import names using the command: invenio vocabularies import --vocabulary names --filepath app_data/vocabularies-future.yaml

3- Observe the behavior: names with an 'id' field (e.g., Stephen Hawking) are imported, while names without an 'id' (e.g., Albert Einstein) are rejected with a validation error.
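To spot the affected entries before running the import, a small pre-check along these lines might help. This is only a sketch, not part of invenio-vocabularies: the function name `entries_missing_id` is hypothetical, and the entries are written as plain Python dicts with the same keys as the YAML above (with PyYAML installed, the list could instead be loaded from the file with `yaml.safe_load`).

```python
def entries_missing_id(entries):
    """Return (index, label) pairs for name entries without a usable 'id'.

    Hypothetical helper for illustration; it only inspects the data and
    does not call any InvenioRDM API.
    """
    missing = []
    for index, entry in enumerate(entries):
        if not entry.get("id"):
            # Build a readable label from whichever name parts are present.
            parts = (entry.get("given_name"), entry.get("family_name"))
            label = " ".join(part for part in parts if part)
            missing.append((index, label))
    return missing


# Same shape as the two names.yaml entries from step 1.
names = [
    {
        "family_name": "Einstein",
        "given_name": "Albert",
        "identifiers": [{"identifier": "gnd:118529579", "scheme": "gnd"}],
    },
    {
        "family_name": "Hawking",
        "given_name": "Stephen",
        "id": "0000-0002-9079-593X",
        "identifiers": [
            {"identifier": "https://orcid.org/0000-0002-9079-593X", "scheme": "orcid"}
        ],
    },
]

for index, label in entries_missing_id(names):
    print(f"entry {index} ({label}) has no 'id'")
```

For the two entries in this report, only the Einstein entry is flagged, which matches the entry that fails to import.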

Expected behavior

Names should be imported consistently, regardless of the presence of an 'id' field in the YAML file.

Screenshots (if applicable)

This is before and after adding the 'id' field to each entry in the names vocabulary: Screenshot 2023-12-02 152923
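Adding the 'id' field by hand gets tedious for a long list of names. One possible way to fill it in, sketched below, is to fall back to the first listed identifier when an entry has no 'id'. Both the helper name `with_fallback_id` and the choice of the first identifier are assumptions made for illustration; whether the importer accepts such a value as an 'id' has not been verified here.

```python
def with_fallback_id(entry):
    """Return a copy of entry that has an 'id'.

    Hypothetical helper: an existing 'id' is kept as-is; otherwise the
    value of the first item in 'identifiers' is used (an assumption,
    not a rule taken from InvenioRDM).
    """
    if entry.get("id"):
        return dict(entry)
    identifiers = entry.get("identifiers") or []
    if not identifiers:
        raise ValueError("entry has neither 'id' nor 'identifiers'")
    return {**entry, "id": identifiers[0]["identifier"]}


# The entry from this report that was rejected for lacking an 'id'.
einstein = {
    "family_name": "Einstein",
    "given_name": "Albert",
    "identifiers": [{"identifier": "gnd:118529579", "scheme": "gnd"}],
}

patched = with_fallback_id(einstein)
print(patched["id"])
```

The original dict is left untouched, so the patched entries can be compared against the source file before writing anything back.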

Additional context

This problem seems to have arisen in versions after V12dev25, specifically after this big refactor: https://github.com/inveniosoftware/invenio-vocabularies/commit/ba8d3c632de625a29206e4a369a7468fe5c1b2e1

I did not test other ways of importing names, such as the ORCID public dataset.

github-actions[bot] commented 9 months ago

This issue was automatically marked as stale.

tmorrell commented 9 months ago

I suspect this was an intended change, since previously it was difficult to figure out how to update name entries if multiple identifiers were present.

The update functionality still doesn't work from invenio vocabularies update, but I think that's a separate bug.

Samk13 commented 9 months ago

@tmorrell Could you share the issue link for invenio vocabularies update or, if it doesn't exist, create one that describes how to reproduce it?

tmorrell commented 9 months ago

Just added the issue: https://github.com/inveniosoftware/invenio-vocabularies/issues/292