IQSS / dataverse

Open source research data repository software
http://dataverse.org

Feature Request/Idea: adding custom controlled vocabulary JavaScript script #8722

Closed ErykKul closed 1 year ago

ErykKul commented 2 years ago

Overview of the Feature Request We can define a controlled vocabulary locally as a list (optionally with translated values). But if the list is large and needs to be maintained, it is more practical to have, for example, a lookup function that searches for values in an external service. We can have external controlled vocabularies using the "skosmos" protocol, but that is overkill for a simple list (enumeration) for one field (e.g., author name using an author lookup). A better solution is to allow a custom JavaScript script to control the values of specific fields. A custom script can contain all the needed functionality; we only need a way to load it when it is needed.

What kind of user is the feature intended for? Developer

What inspired the request? Our own author lookup service, which is currently configured as an external controlled vocabulary. However, we do not have URIs, and we see many warnings in the log because the code tries to resolve the values, which a simple list does not support.

What existing behavior do you want changed? None. We want only the possibility to add custom scripts.

pdurbin commented 2 years ago

> A custom script can have all the needed functionality, we only need a way to load it when it is needed.

@ErykKul do you have ideas on how to load it? That is, are you considering making a pull request? 😄

ErykKul commented 2 years ago

@pdurbin I am going to make a pull request. What I am working on now: add a setting, :ControlledVocabularyCustomJavaScript, and check for its value in the getVocabScripts method of DatasetFieldServiceBean; if it is set, add the custom script to the list.
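The logic described here can be sketched roughly as follows. The real method is Java code in DatasetFieldServiceBean; this is only a JavaScript illustration of the idea. It assumes that the existing list is built from the "js-url" entries of the vocabulary configs and that settings can be read from a plain object. Both are assumptions made for the sketch, not the actual implementation.

```javascript
// Sketch only: collects script URLs to load on the page.
// `cvocConfigs` and `settings` are hypothetical stand-ins for the configured
// external vocabularies and the installation's settings.
function getVocabScripts(cvocConfigs, settings) {
  const scripts = new Set(); // a Set avoids loading the same script twice

  // Existing behavior (assumed): one script per vocabulary config.
  for (const config of cvocConfigs) {
    if (config["js-url"]) {
      scripts.add(config["js-url"]);
    }
  }

  // Proposed addition: if the setting has a value, add the custom script.
  const custom = (settings[":ControlledVocabularyCustomJavaScript"] || "").trim();
  if (custom) {
    scripts.add(custom);
  }

  return [...scripts];
}
```

With no setting configured the result is unchanged, which matches the stated goal of not altering existing behavior.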

ErykKul commented 2 years ago

Most of the work is documenting this.

qqmyers commented 2 years ago

FWIW: There is nothing in the current mechanism that limits you to using skosmos. The mechanism does try to address several things that were/are considered important and that are hard to do with just JavaScript and a list:

Also note that https://github.com/gdcc/dataverse-external-vocab-support/issues/10 is one probable source of log warnings.

ErykKul commented 2 years ago

@qqmyers Our implementation is not skosmos; it is an extremely simple lookup. We simply try to find people from our organization using a few letters of their name. This means we have no translations or any of the other functionality foreseen for controlled vocabularies. The biggest problem we had with the controlled vocabulary implementation is the lack of a URI identifying a person in our organization. We therefore used an empty string ("") as retrieval-uri, which causes problems when the code tries to resolve a URI that does not exist. We used this config (notice also that term-uri-field is not a URI field at all):

    [
      {
        "field-name": "authorName",
        "term-uri-field": "authorName",
        "js-url": "/covoc/js/covoc.js",
        "protocol": "covoc-author",
        "retrieval-uri": "",
        "allow-free-text": true,
        "languages": "",
        "vocabs": "",
        "managed-fields": {},
        "retrieval-filtering": {}
      }
    ]

The only thing we really need from that is the JavaScript for the lookup:

[image not preserved]
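As an illustration of the kind of lookup meant here, a minimal sketch of matching people by a few letters of their name could look like this. The function name, the sample names, and the minimum query length are all hypothetical; a real script would query the organization's lookup service rather than an in-memory list.

```javascript
// Hypothetical sample data standing in for the organization's directory.
const PEOPLE = ["Ada Lovelace", "Alan Turing", "Grace Hopper"];

// Returns the names that contain the typed letters, ignoring case.
// Very short queries return nothing, to avoid listing the whole directory.
function findPeople(query, people, minLength = 2) {
  const needle = query.trim().toLowerCase();
  if (needle.length < minLength) {
    return [];
  }
  return people.filter((name) => name.toLowerCase().includes(needle));
}

console.log(findPeople("al", PEOPLE));  // [ 'Alan Turing' ]
console.log(findPeople("ace", PEOPLE)); // [ 'Ada Lovelace', 'Grace Hopper' ]
```

Nothing in this sketch needs a URI per person, which is the point of the request: the field only stores the chosen name.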

Making controlled vocabularies work without URIs would require extensive changes to the code, while our only goal is to allow a custom script. I have looked for the solution that requires the fewest changes and fixes our problem.

Kris-LIBIS commented 2 years ago

@qqmyers I am stepping in as I asked Eryk to look for an alternative to using the external vocabularies.

First of all, maintaining an external vocabulary for the simple lookup lists that we envision is largely overkill. Nobody in the RDM support team will be able to maintain such a system. While the external vocabulary integration using SKOSMOS is very valuable, the problem we want to solve is very different:

The use case is a more dynamic version of the controlled vocabularies in the metadata blocks, without the need for the things you mentioned. Some of these vocabularies are updated frequently (e.g. department list and structure) and/or are simply too large to be listed in the metadata block (e.g. user lists, lists of publications, ...). We know from tests that modifying controlled vocabularies in a metadata block is not well supported in Dataverse, so we looked for a solution that keeps the vocabulary list out of the metadata block while remaining equally simple.