IQSS / dataverse-pm

Project management issue tracker for the Dataverse Project. Note: Related links and documents may not be public.
https://dataverse.org

Epic: Support the 5 core persistent identifiers in Dataverse #19

Closed mreekie closed 3 months ago

mreekie commented 1 year ago

This deliverable defines our support for the 5 core PIDs.

For most of the 5, we offer support already.

For the others:

This came out of our meeting on Jan 31st, 2023. We discussed how to define our support for each of these PIDs as a statement of the form "we support this type of data" "for this type of field".

Includes...


Just a reminder that this is a "deliverable" issue. It does not go on the sprint board. It groups other issues that together represent the delivery of some objective or piece of functionality.

mreekie commented 1 year ago

First step is to get with Julian and make sure that breaking this out is the right thing.

mreekie commented 1 year ago

monthly report

In terms of defining support, most of the activity surrounds ROR and FundRef, so they will be the easiest to express our support for.

As I understand it, the outcome of the meeting was that only so much can be generalized about these different persistent identifiers (PIDs), so we will need to note our support for each one individually.

Some are general purpose and have the primary intent of providing a globally unique identifier to something, where that "thing" is just about anything.

In other cases, the identifier has an element that would typically go on a form, such as a label from a controlled vocabulary that means something specific to a human, but it may also have additional fields associated with that primary data, i.e. metadata on the metadata. Really meta. I think ORCID is an example where the human-readable portion is just one piece. There is also a unique identifier associated with it, and I think perhaps a URL and even other fields.

The unique identifiers cannot be put in a single bucket either. Some are URIs, some are alphanumeric.
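To make the last two points concrete, here is a minimal sketch of a value that carries a human-readable label alongside its machine-readable parts, plus a check for whether an identifier was entered as a URI or as a bare token. The class name `PidValue`, its fields, and the helper `identifier_form` are made up for illustration and are not taken from Dataverse; the identifier shown is only a sample value.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass
class PidValue:
    """Hypothetical container: a label plus the machine-readable parts."""
    label: str                 # what a person sees on the form
    identifier: str            # the unique identifier as entered
    url: Optional[str] = None  # resolvable form, when one exists


def identifier_form(identifier: str) -> str:
    """Return 'uri' if the identifier parses as an http(s) URI, else 'token'."""
    parsed = urlparse(identifier.strip())
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "uri"
    return "token"


# Sample value only; the same identifier can show up in either form.
example = PidValue(
    label="Example Person",
    identifier="0000-0002-1825-0097",
    url="https://orcid.org/0000-0002-1825-0097",
)
```

The point of the sketch is only that the "shape" of a PID value differs by scheme, so any shared handling has to tolerate both forms.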

The human readable fields will also be internationalized.

In terms of usage in fields within Dataverse, you cannot assume too much either, because of pre-existing datasets. In existing data there are many cases where a field intended for one particular PID is instead populated with a different one.

There are decisions to be made about which fields on Dataverse forms in the UI PIDs are expected to be used in. The field/form combination might affect what is displayed and what is saved from form to form. E.g., some entries may allow the human-readable portion of the information to go on the form, while a unique identifier and other data are added to the DV database without being shown. In other cases, there will be no "extra" fields and only the human-readable text will be saved to the form and to the database.

Which PIDs can interact with which fields on DV forms could possibly be controlled by extending an existing system that uses JSON files to do the mapping.
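As a rough sketch of what such a JSON mapping might look like and how it could be consulted, assuming a layout invented here for illustration: the field names (`author_identifier`, `funder_name`), the keys (`allowed_pids`, `hidden_keys`), and the helper functions are all hypothetical and do not describe the existing system's actual format.

```python
import json

# Hypothetical mapping file contents; field names and keys are illustrative only.
MAPPING_JSON = """
{
  "author_identifier": {"allowed_pids": ["ORCID"], "hidden_keys": ["identifier", "url"]},
  "funder_name": {"allowed_pids": ["ROR", "FundRef"], "hidden_keys": []}
}
"""


def load_mapping(text):
    """Parse the JSON text into a dict keyed by form field name."""
    return json.loads(text)


def is_allowed(mapping, field, scheme):
    """True if the mapping lets this PID scheme populate this field."""
    rule = mapping.get(field)
    return rule is not None and scheme in rule["allowed_pids"]


def split_for_storage(mapping, field, value):
    """Split a value into (shown on the form, saved without being shown)."""
    hidden_keys = set(mapping[field]["hidden_keys"])
    shown = {"label": value["label"]}
    hidden = {k: v for k, v in value.items() if k in hidden_keys}
    return shown, hidden


mapping = load_mapping(MAPPING_JSON)
```

With this layout, a field with an empty `hidden_keys` list saves only the human-readable text, while other fields keep the identifier and URL in the database without displaying them.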

mreekie commented 1 year ago

There is general agreement that we have "good" support for:

cmbz commented 1 year ago

Develop a plan to address the work needed to move this epic forward. Note: re-architecture work should take into account what is planned here (e.g., process used to change UI for metadata fields)

jggautier commented 1 year ago

A Google Slide at https://docs.google.com/presentation/d/1PtqmEzAamuM2__V8psOIetgNODPQxjqSEOuxL3kAV-Y summarizes what support means and which types of metadata are and aren't supported in some way. I'm hoping this helps scope the work.

cmbz commented 1 year ago

2023/09/25: Will require a proposal to scope the investigation and identify a set of tasks. We will need to split this issue into some sub-issues that can be sized, prioritized, and implemented. See issue: https://github.com/IQSS/dataverse.harvard.edu/issues/230

cmbz commented 10 months ago

2024/01/08

jggautier commented 3 months ago

The NIH-GREI's metadata subcommittee and the Dataverse UX WG have done work to address the tasks outlined at https://github.com/IQSS/dataverse-pm/issues/19#issuecomment-1530046654 and are continuing to work on them. That work is being tracked in other GitHub issues.

I'm going to close this GitHub issue.

cmbz commented 3 months ago

@jggautier are these the additional issues that you/UX working group are addressing? If not, could you add them to the epic, please, and add a description to the issue itself? Thanks.

jggautier commented 3 months ago

Hi @cmbz. Yeah, the UX WG is planning to address part of https://github.com/IQSS/dataverse-pm/issues/127 during its second design sprint, which is about improving metadata about research objects that are related to datasets, like related journal publications that often have DOIs and Handles.

I think other parts of extending and enhancing support for PIDs and Handles (https://github.com/IQSS/dataverse-pm/issues/195) include PIDs for dataset versions and for funding awards and grants (as opposed to funding organizations). In the PR at https://github.com/IQSS/dataverse/pull/9462 I see discussion and work about PIDs for dataset versions. I think PIDs for funding awards and grants require more discussion and haven't been a priority for the NIH-GREI groups and the Dataverse UX WG.

The first design sprint that the Dataverse UX WG is in the middle of now involves improving support of RORs and ORCIDs for describing people and organizations associated with datasets. I've been using the GitHub issue at https://github.com/IQSS/dataverse-pm/issues/127 and the "Epics and Issues" listed there to describe that work.