IQSS / dataverse.harvard.edu

Custom code for dataverse.harvard.edu and an issue tracker for the IQSS Dataverse team's operational work, for better tracking on https://github.com/orgs/IQSS/projects/34

Implement Dropdown Selector for NIH Controlled Vocabulary in Keywords #267

Open Saixel opened 5 months ago

Saixel commented 5 months ago

Background

As part of our ongoing collaboration with the CAFE project team, we have identified a need to refine the user experience in selecting controlled vocabulary terms for dataset keywords. This initiative aims to align the Dataverse metadata input process with standard vocabularies and enhance data discoverability and consistency.

Feature Request

Implement a selector (dropdown, option boxes, widget, etc.) that lets users select and add terms from the NIH controlled vocabulary glossary as keywords.

Current State:

Desired Functionality:

Justification

The CAFE project team requires a more standardized and error-proof method for keyword selection to ensure metadata quality and consistency. This enhancement will support users in accurately tagging datasets, thus facilitating better data curation and searchability.

Implementation Considerations

Additional Context

This request is driven by user feedback and the project's commitment to improving data quality and curation practices within the CAFE project's use of Dataverse. We have already compiled a comprehensive list, which includes each keyword, its description, and the associated URL, ready to be utilized for the selector feature.

pdurbin commented 5 months ago

Related:

However, I checked with @Saixel and he plans to implement this using a custom metadata block for the CAFE project rather than attempting to modify the keyword field in the citation block (which is what the issue above is about).

He said there are almost 300 controlled vocabulary values.

scolapasta commented 4 months ago

First, this should be moved to the harvard dataverse repo, as it should not require any code in the core.

Second, we're wondering about this: "Dynamically populate options within the selector based on the NIH controlled vocabulary glossary"

Is the idea to have these values read from an existing API? If so, we would use the external CV functionality, and the best next step would be a spike to use this API and make sure there isn't any unexpected behavior (that spike would likely be a size 10, for someone who already has experience with the external CV functionality). If not, and it's just using our external CV functionality, then all that needs to be done is to add the values to the appropriate tsv file, which can be sized as a 3.

Saixel commented 4 months ago

> Related:
>
> However, I checked with @Saixel and he plans to implement this using a custom metadata block for the CAFE project rather than attempting to modify the keyword field in the citation block (which is what the issue above is about).
>
> He said there are almost 300 controlled vocabulary values.

@pdurbin Thanks for pointing out the related issue. My initial approach was to use a custom metadata block to avoid changing the current keyword block structure. However, I see in the comment in https://github.com/IQSS/dataverse/issues/10288 that an autocomplete function is suggested for a similar case. Our goal is to present a list of options for keyword selection from the terms we have prepared in a CSV, so either a dropdown or autocomplete could be a viable solution. If it's okay with you, we can dig deeper into this topic as we work on this implementation.

Saixel commented 4 months ago

> First, this should be moved to the harvard dataverse repo, as it should not require any code in the core.
>
> Second, we're wondering about this: "Dynamically populate options within the selector based on the NIH controlled vocabulary glossary"
>
> Is the idea to have these values read from an existing API? If so, we would use the external CV functionality, and the best next step would be a spike to use this API and make sure there isn't any unexpected behavior (that spike would likely be a size 10, for someone who already has experience with the external CV functionality). If not, and it's just using our external CV functionality, then all that needs to be done is to add the values to the appropriate tsv file, which can be sized as a 3.

@scolapasta The issue has been moved to the Harvard Dataverse repo as per your guidance (thanks for pointing this out). Regarding the "Dynamically populate options within the selector based on the NIH controlled vocabulary glossary" feature, I'd like to clarify that we don't have an API. Instead, we have a CSV with a list of almost 300 terms. If we can use the external CV functionality you mentioned for this purpose, I would appreciate any documentation or pointers to existing implementations to explore and test this further.

pdurbin commented 4 months ago

> I'd like to clarify that we don't have an API. Instead, we have a CSV with a list of almost 300 terms. If we can use the external CV functionality you mentioned for this purpose

I would recommend experimenting with configuring Author Affiliation to look up values from ROR. For config advice, please see https://github.com/IQSS/dataverse/pull/10331#issuecomment-2062303940

That said, this feature depends on an external API (like the ROR API). So you'd need to build and host that API somehow.

It might be easier to use the database and put the 300 values in a controlled vocabulary. But if you have a plan for how to build an API and where to host it, it should be doable. 😄
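To give a rough idea of what "build an API" could mean here, below is a minimal sketch of a lookup service over the term list, using only the Python standard library. Everything specific in it is an assumption: the CSV column names (`keyword`, `description`, `url`), the sample rows, the `q` query parameter, and the port are all hypothetical, and the JSON shape is not what the external CV scripts in Dataverse necessarily expect, so it would need adapting.

```python
import csv
import io
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical sample rows; the real CSV has almost 300 terms.
SAMPLE_CSV = """keyword,description,url
Air Pollution,Example description,https://example.org/terms/air-pollution
Heat Wave,Example description,https://example.org/terms/heat-wave
"""


def load_terms(csv_text):
    """Parse CSV text into a list of dicts, one per term."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def search_terms(terms, query, limit=10):
    """Return up to `limit` terms whose keyword contains `query` (case-insensitive)."""
    q = query.strip().lower()
    return [t for t in terms if q in t["keyword"].lower()][:limit]


TERMS = load_terms(SAMPLE_CSV)


class TermHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Read the search text from ?q=...; an empty query matches every term.
        params = parse_qs(urlparse(self.path).query)
        query = params.get("q", [""])[0]
        body = json.dumps(search_terms(TERMS, query)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def main():
    # Serve on localhost only; hosting and access control are out of scope here.
    HTTPServer(("127.0.0.1", 8000), TermHandler).serve_forever()
```

Calling `main()` starts the server, after which a request such as `http://127.0.0.1:8000/?q=heat` would return the matching terms as JSON. Hosting, caching, and uptime are the parts that make this more work than the controlled vocabulary route.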

Saixel commented 1 month ago

After further discussion, we've decided to expedite the NIH controlled vocabulary integration by creating a new custom metadata block with a dropdown for CCH terms, using our prepared list. This approach will help us avoid the complexity and longer development time of modifying the existing keyword metadata block.
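For anyone following along, here is a minimal sketch of how the prepared list could be turned into the controlled vocabulary rows of a custom metadata block TSV. It rests on several assumptions: the CSV column names (`keyword`, `description`, `url`) are guesses at how our list is laid out, the field name `cafeKeyword` is hypothetical, and the `#controlledVocabulary` header layout follows my reading of the existing TSV files such as citation.tsv, so it should be checked against the metadata customization documentation. The identifier column is left empty here; whether the URL belongs there is an open question.

```python
import csv
import io


def controlled_vocabulary_rows(csv_text, field_name):
    """Build the #controlledVocabulary section of a metadata block TSV.

    Assumes the CSV has a `keyword` column; other columns are ignored.
    """
    lines = ["#controlledVocabulary\tDatasetField\tValue\tidentifier\tdisplayOrder"]
    reader = csv.DictReader(io.StringIO(csv_text))
    for order, row in enumerate(reader):
        value = row["keyword"].strip()
        # Data rows start with an empty first column, hence the leading tab.
        # The identifier column is deliberately left empty in this sketch.
        lines.append(f"\t{field_name}\t{value}\t\t{order}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample input standing in for the real list of terms.
    sample = (
        "keyword,description,url\n"
        "Air Pollution,Example description,https://example.org/terms/air-pollution\n"
        "Heat Wave,Example description,https://example.org/terms/heat-wave\n"
    )
    print(controlled_vocabulary_rows(sample, "cafeKeyword"), end="")
```

The output would be appended below the `#metadataBlock` and `#datasetField` sections of the block's TSV, with the dropdown field defined there as allowing controlled vocabulary.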