Open jggautier opened 6 years ago
I guess it just got boiled down (probably by me) in the notes for the 2017-11-07 Dataverse Community Call at https://groups.google.com/d/msg/dataverse-community/d7BnFiRA3_I/O6cYqSBfCAAJ as "Translating just interface vs. interface plus metadata itself" but what I was trying to say is that in the fork of Dataverse at http://opendata.pku.edu.cn you can download the metadata itself in both English and Chinese. Here are the two views when you click the language toggle in the upper right:
Here's the dataset above: http://opendata.pku.edu.cn/dataset.xhtml?persistentId=doi:10.18170/DVN/XQAM6B
There's some code at https://github.com/pengchengluo/Peking-University-Open-Research-Data-Platform but it's just a single commit. It was forked before Dataverse 4.0 was tagged.
The challenge, of course, with translating metadata is that we don't know what it'll be. So we'd either have to expect the depositors to do the translation (which is what Peking does), or use some service like Google Translate.
Don't we allow non-English languages for dataset metadata these days? I can't find the PR that allowed it but this one is related:
i18n display of controlled vocabulary values (CVVs) exists, and users can specify which language they are entering metadata in (from a list of allowed languages set for the repo), but we do not have a way for users to add translations, i.e. to provide a description in both French and English, which I think was part of this issue.
@qqmyers thanks. You helped me find the setting: https://guides.dataverse.org/en/5.11.1/installation/config.html#allowing-the-language-used-for-dataset-metadata-to-be-specified
I guess what this issue is about is supporting multiple languages for the same dataset metadata, like the Chinese/English example above in the fork.
Right now, we only support one language per dataset. At least people can indicate which language it is with the setting described above.
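To make the gap concrete, here's a rough sketch of what "multiple languages for the same metadata" could look like as a data structure. This is not how Dataverse stores metadata: the `MultilingualField` class, its `get` method, and the fallback behavior are all made up for illustration, as are the sample values.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class MultilingualField:
    """Hypothetical free-text metadata field with one value per language."""
    name: str
    # Maps a language code (e.g. "en", "fr") to the depositor-supplied text.
    values: Dict[str, str] = field(default_factory=dict)

    def get(self, lang: str, fallback: str = "en") -> Optional[str]:
        # Prefer the requested language, then the fallback language, else None.
        if lang in self.values:
            return self.values[lang]
        return self.values.get(fallback)


# Illustrative data only: a description entered in both English and French.
description = MultilingualField(
    name="description",
    values={
        "en": "Survey of household income.",
        "fr": "Enquête sur le revenu des ménages.",
    },
)

print(description.get("fr"))  # French text was supplied, so it is returned
print(description.get("zh"))  # no Chinese text, so the English fallback is used
```

Today's one-language-per-dataset model would correspond to `values` always having exactly one entry; the fork's language toggle behaves more like calling `get` with a different `lang`.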
2024/09/09: Keeping.
We should explore adding a new "translate metadata" external tool that could leverage LLMs to auto-create translations (with the requisite "this was AI generated" warning :)
A group interested in Dataverse would like to be able to download metadata in other languages (a little more context in this Dataverse RT ticket). This issue is a placeholder ticket to capture that interest while more information is gathered. The issue is about user-generated text, such as what users enter into free text metadata fields. It's different than the internationalization discussions (https://github.com/IQSS/dataverse/issues/209) and work that's happened so far, since those are more about being able to provide translations for static text (metadata fields and field labels in the UI, validation text and messages, maybe controlled vocabulary terms, etc.).