IQSS / dataverse

Open source research data repository software

Auto-completion based on existing database values for the field Keyword. #6154

Open alejandratenorio opened 5 years ago

alejandratenorio commented 5 years ago

Users at CIMMYT need to create a suggested vocabulary list to use for auto-completion values in the Keyword field. The community has proposed adding similar functionality to metadata such as Authors or Keywords, using a lookup that would open from a button within the template, as described in #4772. Unfortunately, we don't know the current state of this development.

Our list of requirements is as follows:

  1. The autocomplete must automatically fill in the fields of:
    • a. Term
    • b. Vocabulary
    • c. Vocabulary URL
  2. Each term can have a main language and translations.
  3. As administrators, we must manage the controlled vocabularies:
    • a. Add new vocabularies to Dataverse using an easy-to-structure CSV file.
    • b. Edit and delete vocabularies.
    • c. List the controlled vocabularies.
    • d. Allow editing a vocabulary: add, edit and delete terms and translations.
    • e. Use of a controlled vocabulary from a Global Dataverse to a Sub-Dataverse
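
As a rough illustration of requirement 3a, here is a minimal sketch of reading such a CSV file into term, vocabulary and vocabulary URL values. The thread does not specify the CSV layout, so the three-column format with a header row, the class name `VocabularyCsv` and the nested `Term` type are all assumptions made for this example, not CIMMYT's actual design.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: parse a simple vocabulary CSV into Keyword suggestions. */
class VocabularyCsv {

    /** One suggestion row holding the three Keyword subfields from requirement 1. */
    static final class Term {
        final String term;
        final String vocabulary;
        final String vocabularyUrl;

        Term(String term, String vocabulary, String vocabularyUrl) {
            this.term = term;
            this.vocabulary = vocabulary;
            this.vocabularyUrl = vocabularyUrl;
        }
    }

    /**
     * Parses lines of the assumed form "term,vocabulary,vocabularyUrl".
     * The first line is treated as a header and blank lines are skipped.
     * Quoted fields and embedded commas are NOT handled in this sketch.
     */
    static List<Term> parse(String csv) {
        List<Term> terms = new ArrayList<>();
        String[] lines = csv.split("\\R");
        for (int i = 1; i < lines.length; i++) {
            String line = lines[i].trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] cols = line.split(",", -1);
            if (cols.length != 3) {
                throw new IllegalArgumentException(
                        "Line " + (i + 1) + ": expected 3 columns, got " + cols.length);
            }
            terms.add(new Term(cols[0].trim(), cols[1].trim(), cols[2].trim()));
        }
        return terms;
    }
}
```

Rejecting malformed rows up front would fit the "confirm before creating a vocabulary from a CSV file" step shown in the screenshots, since an administrator could see problems before anything is stored. A real implementation would likely use a CSV library to handle quoting.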

We have developed 60% of the requirements; however, CIMMYT's goal is to contribute this new module. We would like to share the design, tables and UI with you for feedback. The following are some of the UIs we have so far. For example: image

List of controlled vocabularies of a Dataverse image

Add new vocabulary image

Confirm before creating a vocabulary from a CSV file image

Manage a previously created vocabulary image

Editing a term image

Create a new term image

jggautier commented 5 years ago

Hi @alejandratenorio! Thanks for sharing! You wrote that:

The community has proposed to add a similar functionality to metadata such as Authors or keywords using a lookup that would unfold in a button within the template, as exposed in #4772, unfortunately, we don't know the current state of this development.

About the status of these similar developments, I just wanted to mention that #4772 is part of a larger GitHub issue (larger in scope), #6030. In #6030 the thinking, I think, is to consider an approach that takes into account as many of these use cases as possible, along with the approaches that teams have proposed and even developed, like yours, before changes to the core code are made and tested. This hasn't been done yet. Just trying to help answer your question about the status of addressing these issues. :)

alejandratenorio commented 5 years ago

Hi @jggautier,

I understand. So if we want to craft a code contribution for this issue, we should send you a proposal to resolve #6030?

alejandratenorio commented 5 years ago

@jggautier, following up on your comments, could we show you our prototype? We would like to develop the global solution, but I'm aware that this needs to be discussed.

jggautier commented 5 years ago

Yes, I think we'd love to see the prototype. It'll really help move solutions forward. (And thanks for the screenshots!)

Could we get back to you next week? I think what we want to do is exactly what you've been doing by contacting people who are already working on aspects of this issue: Trying to collaborate with everyone who shares similar needs, and might have already forked Dataverse to solve those needs.

alejandratenorio commented 4 years ago


Sure, we can show you the prototype, it's the same that @scolapasta saw. We'd be happy if you give us feedback and tell us the way to go on.

pdurbin commented 4 years ago

@alejandratenorio we have the ability to spin up arbitrary branches from forks of Dataverse if you'd like us to try that. The script we use is documented at

Screen Shot 2019-09-18 at 1 40 13 PM

alejandratenorio commented 4 years ago

Hi @pdurbin,

Gerardo did the deploy, dataverseAdmin admin1

pdurbin commented 4 years ago

@alejandratenorio fantastic! When I look at I see dataverse-4.13-18cbbe6 and it looks like you're building from . The odd thing about that repo is that it only has two commits, but I'll worry about that later. 😄

The password works fine, but I didn't play with the feature or get any screenshots (I'm about to leave for the day). @jggautier @TaniaSchlatter @mheppler @djbrooke @scolapasta or others might be interested in playing around with the feature to get a better understanding of how it works. Thanks! 🎉

djbrooke commented 4 years ago

Hi @alejandratenorio, I got a chance to review this with @jggautier @scolapasta @TaniaSchlatter and @pdurbin earlier today. Thanks again for setting up the AWS instance for us.

I have some concerns about building a UI for the administration of this suggestion feature. While the UI that you've worked on so far may work for CIMMYT users, I'm concerned about the scalability of this interface for others that want to allow suggestions for metadata fields while users are adding or editing datasets. I think the better option would be to have the field suggestions managed through an API.

Generally, what we're thinking is:

  • Add table(s) to the application that can be populated with suggestions for metadata fields/compound fields
  • Add an API endpoint that can be used to populate those tables for the root dataverse or a specific dataverse (and possibly the sub-dataverses). We'd need to work with you on determining the format and @jggautier will have some ideas.
  • Use our autocomplete component from PrimeFaces (@scolapasta can you point to this?) to provide suggestions based on what's in the table

There are still some details to be figured out and we're happy to discuss this with you. If managing the suggestions through a UI is a requirement, could we discuss building an external tool that uses the APIs to update the application?

Comments welcome!

alejandratenorio commented 4 years ago

@djbrooke, Could we talk about this on Monday?

djbrooke commented 4 years ago

@alejandratenorio Sure thing. Looking forward to it! Have a good weekend!

stevenmce commented 4 years ago

Hi everyone,

ADA is a definite +1 on adding this feature. I've added it to our project board. We would like to be able to reference an external vocabulary, ideally held at an authoritative source, and then autocomplete based on the user's text entry.

We would be particularly interested in directly using vocabularies on:
a) The ARDC vocabulary server: (here is our current vocab on there:
b) The CESSDA vocabulary server: (note particularly they have the DDI vocabularies there!!!)

Cheers, Steve

alejandratenorio commented 4 years ago

Hi @djbrooke,

I propose separating the functionality according to the type of metadata field. This proposal is designed for compound metadata, such as Keyword or Topic Classification. For metadata fields like Subject or Language, @poikilotherm is working on a prototype (#6000).

  • Add a table to define controlled vocabularies and their associated metadata field.
  • Add tables that can be populated with suggestions for compound fields: term, vocabulary and URL (for each controlled vocabulary).
  • Add two API endpoints that can be used to create controlled vocabularies and their suggestions.
  • Also, add an admin page to manage these features (for admin users).
    • As an administrator, you could upload suggestions one by one or with a CSV file.
    • We would not read directly from a RESTful API, because we would have to handle different input formats, which would be more complex. Tell me if this is important to you.
  • Each controlled vocabulary could be used in a specific Dataverse or Sub-dataverse.

What do you think?
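
To make the proposal easier to discuss, here is a minimal in-memory sketch of suggestion terms that are scoped to a dataverse and also visible from its sub-dataverses. It stands in for the proposed tables and is not Dataverse code: the class `SuggestionStore`, its method names and the use of dataverse aliases as keys are all hypothetical, and the case-insensitive prefix match is just one plausible matching rule.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical in-memory stand-in for the proposed suggestion tables. */
class SuggestionStore {

    // Maps each dataverse alias to its parent alias (null for the root).
    private final Map<String, String> parentOf = new HashMap<>();

    // Maps each dataverse alias to the terms defined directly on it.
    private final Map<String, TreeSet<String>> termsByDataverse = new HashMap<>();

    void addDataverse(String alias, String parentAlias) {
        parentOf.put(alias, parentAlias);
    }

    void addTerm(String alias, String term) {
        termsByDataverse.computeIfAbsent(alias, k -> new TreeSet<>()).add(term);
    }

    /**
     * Returns terms starting with the prefix (case-insensitive), collected from
     * the given dataverse and all of its ancestors, sorted and de-duplicated.
     * Assumes the parent links contain no cycles.
     */
    List<String> suggest(String alias, String prefix) {
        String needle = prefix.toLowerCase(Locale.ROOT);
        TreeSet<String> matches = new TreeSet<>();
        for (String current = alias; current != null; current = parentOf.get(current)) {
            TreeSet<String> terms = termsByDataverse.get(current);
            if (terms == null) {
                continue;
            }
            for (String term : terms) {
                if (term.toLowerCase(Locale.ROOT).startsWith(needle)) {
                    matches.add(term);
                }
            }
        }
        return new ArrayList<>(matches);
    }
}
```

In this sketch a sub-dataverse inherits suggestions by walking up the parent chain at lookup time, which avoids copying terms into every sub-dataverse. Whether inheritance should be automatic or opt-in is one of the details the thread leaves open.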

Gerafp commented 4 years ago

Hi all!

At CIMMYT we are continuing work on this functionality. We are currently working on the autocomplete fields in the dataset form: the user selects a controlled vocabulary from a dropdown list, and its terms are loaded into a list used to complete the other fields. Our problem is that the SelectOneMenu does not call the Java method set in its valueChangeListener attribute. This method is meant to update the selected vocabulary and then load its list of terms, which are used for an autocomplete field. We looked at the implementation used when a user creates a new dataset and replicated it, but it does not work.

Does anyone have any ideas for this problem?

Java bean

Xhtml page

This is an example for development and testing; the final version will be part of the dataset form.

djbrooke commented 4 years ago

Hi @alejandratenorio @Gerafp apologies for missing your previous message. Before we start diagnosing the JSF issues, can we meet to discuss the proposal here and get a demo? I know that we tried to get a demo when we last met but we ran out of time after talking through the (now merged and released) MS login feature.

Would 10 AM ET Friday or 3 PM ET Monday work for you?

poikilotherm commented 4 years ago

Just to spread the word, @mheppler posted this answer to #6000. So for now I will not continue to work on it (other prios, too), but test the new component with our usecase. Dunno if #6339 and its solution is related here, too.

mheppler commented 4 years ago

@poikilotherm the new selectCheckboxMenu PrimeFaces component used by the Subject and Language citation metadata fields allows multiple selections, so that component provides checkboxes in the select dropdown menu. The Keyword compound citation metadata fields would provide a selectOneMenu PrimeFaces component for Term, as is currently used in the UI for the Author and Related Publication compound citation metadata fields. (See screenshot.)

Screen Shot 2019-11-21 at 9 57 19 AM

You can see this code with the render logic to determine which UI component is displayed for various metadata fields in the metadataFragment.xhtml (code snippet from PR #6356).

<ui:fragment rendered="#{dsf.datasetFieldType.controlledVocabulary}">
    <div class="form-group dataset-field-values">
        <div class="form-col-container col-sm-9 edit-field">
            <!-- Single-value controlled vocabulary fields get a plain dropdown -->
            <p:selectOneMenu value="#{dsf.singleControlledVocabularyValue}" converter="controlledVocabularyValueConverter" style="width: auto !important; max-width:100%; min-width:200px;" styleClass="form-control primitive"
                                 id="unique1" rendered="#{!dsf.datasetFieldType.allowMultiples}">
                <f:selectItem itemLabel="#{}" itemValue="" noSelectionOption="true"/>
                <f:selectItems value="#{dsf.datasetFieldType.controlledVocabularyValues}" var="cvv" itemLabel="#{cvv.localeStrValue}" itemValue="#{cvv}"/>
            </p:selectOneMenu>
            <!-- Multi-value fields get a dropdown with checkboxes and a filter -->
            <p:selectCheckboxMenu value="#{dsf.controlledVocabularyValues}" multiple="true" converter="controlledVocabularyValueConverter" styleClass="form-control"
                                   filter="true" filterMatchMode="startsWith" label="#{}"
                                   id="unique2" rendered="#{dsf.datasetFieldType.allowMultiples}">
                <f:selectItems value="#{dsf.datasetFieldType.controlledVocabularyValues}" var="cvv" itemLabel="#{cvv.localeStrValue}" itemValue="#{cvv}"/>
            </p:selectCheckboxMenu>
            <div class="ui-message ui-message-error ui-widget ui-corner-all" aria-live="polite" jsf:rendered="#{!empty dsf.validationMessage}">
                <span class="ui-message-error-detail">#{dsf.validationMessage}</span>
            </div>
        </div>
    </div>
</ui:fragment>

alejandratenorio commented 4 years ago

Hey, @djbrooke,

Monday at 3 PM ET works for us.

djbrooke commented 4 years ago

Thanks @alejandratenorio see you then. Use this information to join:

You can also dial in using your phone. United States: +1 (571) 317-3112 Access Code: 792-074-317

djbrooke commented 4 years ago

Thanks @alejandratenorio and @Gerafp for offering to add a video here and pointing us to any code or template documents!

Gerafp commented 4 years ago

Hi all! Happy 2020 :D We have two videos to show CIMMYT's work on this issue. The videos are in OneDrive:

Prototype 1 -!Al54grRzE57lhjS7fNpSh2NA0ArG
Prototype 2 -!Al54grRzE57lhjO1kbD6J6AKUkdz

We look forward to your comments

Good Day

pdurbin commented 4 years ago

@Gerafp thanks! I just added these videos to the top of DataverseTV:

Screen Shot 2020-02-25 at 1 57 53 PM

@4tikhonov these are the videos I told you would be coming. @RightInTwo you might want to check these out. I know lots of people are interested in better controlled vocabulary stuff in Dataverse. 😄

djbrooke commented 4 years ago

Hey @Gerafp, happy new year! Thanks for sharing these videos. We're working on moving forward some big initiatives at the moment (#3404 and #6085) and won't be able to be responsive on this for the next two weeks or so. I'll work with the UI/UX team here to respond to what you've built so far once we have some additional bandwidth. Let me know if you have any concerns.

pdurbin commented 4 years ago

@Gerafp from a quick look this morning, I was glad to see the author/depositor "create dataset" experience with the new controlled vocabulary feature toward the end of the second video. A couple screenshots:

Screenshot from 2020-02-26 07-57-36

Screenshot from 2020-02-26 07-57-46

I also really appreciate the final slide of the second video where the current development status is made clear:

Screenshot from 2020-02-26 07-58-01

alejandratenorio commented 4 years ago

Hi all,

Thank you all for your comments. We are finishing the second prototype, which we think is more functional than the first one. We hope this contribution will be useful, and we look forward to your suggestions and comments.

djbrooke commented 4 years ago

Thanks @alejandratenorio, good to hear from you. I'm going to tag @qqmyers as he is working on a document regarding metadata use cases in Dataverse and it would be helpful to discuss your work so far, and perhaps see a demo/video.

Gerafp commented 4 years ago

Hi all!

We have continued working on this functionality. We now have a test version of the implementation, based on the second prototype shown in the videos. If you want to test it, the URL is:

u: dataverseAdmin p:qwerty12

Unfortunately, we are experiencing some problems when adding the modal for selecting a term to the "dataset.xhtml" file:

We believe it is an incompatibility with the PrimeFaces elements, but it would be helpful if you could take a look at it.

The problems are strange, because when we add the modal to "template.xhtml", it works correctly. The user can create a new template, select a keyword through the modal, set the values and save them for each template; the same is true when a user edits a template.

We are continuing to work on this function and look forward to your comments. The code is up to date in this issue's branch.

If the server is down, let me know and I will start it.


Gerafp commented 4 years ago

Hi all!

Do you have some suggestions or comments about the last update?


djbrooke commented 4 years ago

Hi @Gerafp, good to hear from you. I hope you and the team are well.

@qqmyers @kmika11 and @stevenmce recently contacted Richard F. to include discussion of this work in the Flexible Metadata session at the Dataverse Community Meeting. We hope that will be a good opportunity to discuss the goals driving this work, and to discuss the work so far.

qqmyers commented 3 years ago

@Gerafp - In trying to start the controlled vocabulary value topic in the Dataverse Metadata working group, I'm wondering if you might want to participate and perhaps recap your requirements/work relevant to the topic in the first meeting for that topic. Would you be interested? If so, the best way to connect would probably be to follow the process to join the WG and get into the slack space and contact me there (see

pdurbin commented 2 years ago

@alejandratenorio @Gerafp @stevenmce and others, I just want to make sure you're all aware that Dataverse 5.7 has been released and includes a new external controlled vocabulary feature. I'll copy the relevant part of the release note below. I'm hoping you get a chance to try it out and see if it meets your needs!


Experimental Support for External Vocabulary Services

Dataverse can now be configured to associate specific metadata fields with third-party vocabulary services to provide an easy way for users to select values from those vocabularies. The mapping involves use of external Javascripts. Two such scripts have been developed so far: one for vocabularies served via the SKOSMOS protocol and one allowing people to be identified via their ORCID. The guides contain info about the new :CVocConf setting used for configuration and additional information about this functionality. Scripts, examples, and additional documentation are available at the GDCC GitHub Repository.

Please watch the online presentation, read the document with requirements and join the Dataverse Working Group on Ontologies and Controlled Vocabularies if you have some questions and want to contribute.

This functionality was initially developed by Data Archiving and Networked Services (DANS-KNAW), the Netherlands, and funded by SSHOC, "Social Sciences and Humanities Open Cloud". SSHOC has received funding from the European Union’s Horizon 2020 project call H2020-INFRAEOSC-04-2018, grant agreement #823782. It was further improved by the Global Dataverse Community Consortium (GDCC) and extended with the support of semantic search.

pdurbin commented 1 year ago

@alejandratenorio @Gerafp have you tried the external vocabulary feature? Here are the latest docs:

Are you still interested in this issue (#6154)? Thanks for the demos and all the work on it! If external vocabularies work for you, perhaps we can close this issue. Please let us know. Thanks.