IQSS / dataverse

Open source research data repository software
http://dataverse.org

As a researcher, I want to be able to download metadata in other languages so that it's easier to understand and reuse #4327

Open jggautier opened 6 years ago

jggautier commented 6 years ago

A group interested in Dataverse would like to be able to download metadata in other languages (a little more context in this Dataverse RT ticket). This issue is a placeholder to capture that interest while more information is gathered. It concerns user-generated text, such as what users enter into free-text metadata fields. That makes it different from the internationalization discussions (https://github.com/IQSS/dataverse/issues/209) and the work done so far, which are more about providing translations for static text (metadata fields and field labels in the UI, validation text and messages, maybe controlled vocabulary terms, etc.).

pdurbin commented 6 years ago

I guess it just got boiled down (probably by me) in the notes for the 2017-11-07 Dataverse Community Call at https://groups.google.com/d/msg/dataverse-community/d7BnFiRA3_I/O6cYqSBfCAAJ as "Translating just interface vs. interface plus metadata itself". What I was trying to say is that in the fork of Dataverse at http://opendata.pku.edu.cn you can download the metadata itself in both English and Chinese. Here are the two views when you click the language toggle in the upper right:

[Two screenshots, 2017-11-30: the dataset page in each of the two language views]

Here's the dataset above: http://opendata.pku.edu.cn/dataset.xhtml?persistentId=doi:10.18170/DVN/XQAM6B

There's some code at https://github.com/pengchengluo/Peking-University-Open-Research-Data-Platform but it's just a single commit. It was forked before Dataverse 4.0 was tagged.

scolapasta commented 6 years ago

The challenge, of course, with translating metadata is that we don't know what it'll be. So we'd either have to expect the depositors to do the translation (that is what Peking does), or use some service like Google Translate.

pdurbin commented 2 years ago

Don't we allow non-English languages for dataset metadata these days? I can't find the PR that allowed it but this one is related:

qqmyers commented 2 years ago

i18n display of controlled vocabulary values (CVVs) exists, and users can specify which language they are entering metadata in (from a list of allowed languages set for the repo). But we do not have a way for users to add translations, i.e. to provide a description in both French and English, which I think was part of this issue.

pdurbin commented 2 years ago

@qqmyers thanks. You helped me find the setting: https://guides.dataverse.org/en/5.11.1/installation/config.html#allowing-the-language-used-for-dataset-metadata-to-be-specified

I guess what this issue is about is supporting multiple languages for the same dataset metadata, like the Chinese/English example above in the fork.

Right now, we only support one language per dataset. At least people can indicate which language it is with the setting described above.
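
To make the gap concrete, here is a rough Python sketch of what "multiple languages for the same metadata" could look like for a single free-text field. The class name `LocalizedText`, its attributes, and the example descriptions are all hypothetical, made up for illustration; this is not how Dataverse stores metadata.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class LocalizedText:
    """A free-text metadata value with optional translations (hypothetical model)."""

    default_lang: str  # language the depositor originally wrote in
    values: Dict[str, str] = field(default_factory=dict)  # language code -> text

    def in_language(self, lang: str) -> str:
        # Fall back to the depositor's language when no translation exists.
        return self.values.get(lang, self.values[self.default_lang])


# Illustrative data only: one description supplied in two languages.
description = LocalizedText(
    default_lang="en",
    values={
        "en": "Survey of household income.",
        "fr": "Enquête sur le revenu des ménages.",
    },
)

print(description.in_language("fr"))  # French text supplied by the depositor
print(description.in_language("de"))  # no German text, so the English one is used
```

With only one language per dataset, `values` would always hold a single entry; the feature discussed here amounts to allowing more than one.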

cmbz commented 1 month ago

2024/09/09: Keeping.

scolapasta commented 1 month ago

We should explore adding a new "translate metadata" external tool that could leverage LLMs to auto-create translations (with the requisite "this was AI generated" warning :)
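
As a very rough sketch of the idea, assuming the tool receives free-text fields as a plain dictionary: the function name `translate_metadata`, the `translate` callable, and the output shape are all hypothetical, and `fake_translate` merely stands in for a real LLM or translation service call.

```python
from typing import Callable, Dict

AI_NOTICE = "This translation was AI generated."


def translate_metadata(
    fields: Dict[str, str],
    target_lang: str,
    translate: Callable[[str, str], str],
) -> Dict[str, object]:
    """Translate each free-text field and attach an AI-generated warning."""
    translated = {name: translate(text, target_lang) for name, text in fields.items()}
    return {"language": target_lang, "fields": translated, "notice": AI_NOTICE}


def fake_translate(text: str, target_lang: str) -> str:
    # Placeholder: a real tool would call a translation service or LLM here.
    return f"[{target_lang}] {text}"


result = translate_metadata(
    {"title": "Household Income Survey"},  # illustrative metadata only
    "fr",
    fake_translate,
)
print(result["fields"]["title"])
print(result["notice"])
```

Passing the translator in as a callable keeps the sketch independent of any particular service, so the warning is attached no matter which backend does the translating.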