MPDL / dataverse

Open source research data repository software
http://dataverse.org

Decoding problems in language vocabulary #30

Closed hofmannc closed 1 year ago

hofmannc commented 2 years ago

Server: prod-edmond2
Test date: 14.02.2022
Browser: Firefox
Version: Dataverse v5.8
User: Admin
Action: edit the metadata of a dataset -> "select language"

[Screenshot]

Result: Some list items are displayed with incorrectly decoded characters.

helkv commented 2 years ago

Incorrectly encoded language values must have been imported to the server while the metadata controlled vocabulary was being configured. The cause of the incorrect encoding is unknown.

The incorrectly encoded language values are duplicates of existing, correctly encoded language values.

Fix: Removed the incorrectly encoded language values from the DB. The list of languages (controlled vocabulary) should now be correct and free of incorrectly encoded values.
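To double-check a cleanup like the one above, the vocabulary values can be scanned for entries that look like mis-decoded copies of other entries. This is only a sketch: it assumes the bad values came from UTF-8 bytes decoded as Latin-1 (the issue does not say which charset was involved), and the function names and sample values are hypothetical.

```python
from typing import Dict, Iterable, Optional


def repair_mojibake(value: str, wrong_charset: str = "latin-1") -> Optional[str]:
    """Undo a UTF-8-read-as-`wrong_charset` mix-up, or return None if it does not apply."""
    try:
        repaired = value.encode(wrong_charset).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the value has characters outside wrong_charset, or the
        # resulting bytes are not valid UTF-8: not this kind of mojibake.
        return None
    return repaired if repaired != value else None


def find_mojibake_duplicates(values: Iterable[str]) -> Dict[str, str]:
    """Map each mis-decoded value to the correct value it duplicates."""
    values = list(values)
    known = set(values)
    duplicates = {}
    for value in values:
        repaired = repair_mojibake(value)
        if repaired is not None and repaired in known:
            duplicates[value] = repaired
    return duplicates


# Hypothetical sample: one correct entry plus its mis-decoded copy.
sample = ["English", "Volapük", "Volapük".encode("utf-8").decode("latin-1")]
bad_values = find_mojibake_duplicates(sample)
```

Only values whose repaired form already exists in the list are reported, which matches the observation that the bad entries were duplicates of correct ones.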

helkv commented 2 years ago

Updating the metadata controlled vocabulary via the API call api/admin/datasetfield/load (sending the local citation.tsv configuration) produces the observed incorrectly encoded languages in the vocabulary.

TODO: Find out why the languages are incorrectly encoded.
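For reference, the load call mentioned above could be reproduced roughly as follows. This is a sketch, not the exact command used here: the base URL is a placeholder, and the Content-Type header follows the curl example in the upstream Dataverse admin guide as far as I remember it. The file is sent as raw bytes, so the client does not re-encode anything.

```python
from urllib.request import Request

# Endpoint path as named in this issue.
LOAD_PATH = "/api/admin/datasetfield/load"


def build_load_request(base_url: str, tsv_bytes: bytes) -> Request:
    """Build (but do not send) the POST request that uploads a metadata block TSV."""
    return Request(
        base_url.rstrip("/") + LOAD_PATH,
        data=tsv_bytes,  # raw file bytes, sent unchanged
        method="POST",
        headers={"Content-Type": "text/tab-separated-values"},
    )


# Usage (assumed local instance; not executed here):
#   from urllib.request import urlopen
#   with open("citation.tsv", "rb") as f:
#       urlopen(build_load_request("http://localhost:8080", f.read()))
```

Because the bytes leave the client untouched, any wrong characters that show up afterwards would have to be introduced when the server decodes the request body.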

helkv commented 1 year ago

For similar problems, see https://github.com/IQSS/dataverse/issues/5234, https://github.com/IQSS/dataverse/issues/6675 and https://github.com/IQSS/dataverse/issues/6717

Solution: Adding <jvm-options>-Dfile.encoding=UTF8</jvm-options> to domain.xml and restarting Payara fixes the problem (as described here: https://github.com/IQSS/dataverse/issues/5234#issuecomment-434276254).
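A rough illustration of why the default charset matters: if the server turns the UTF-8 bytes of the TSV into text using a non-UTF-8 default, every non-ASCII character is split into several wrong ones. The sketch below simulates that in Python. Latin-1 as the "wrong" default is an assumption, since the actual JVM default on the server was not recorded in this issue, and the function name and sample row are hypothetical.

```python
def read_body(body: bytes, default_charset: str) -> str:
    """Simulate a server decoding an uploaded body with its default charset."""
    return body.decode(default_charset)


# A made-up vocabulary row with one non-ASCII character, encoded as UTF-8.
row = "language\tVolapük\n".encode("utf-8")

with_utf8_default = read_body(row, "utf-8")      # value survives intact
with_latin1_default = read_body(row, "latin-1")  # 'ü' becomes two characters
```

With a UTF-8 default the value round-trips unchanged, which is consistent with the fix above.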

helkv commented 1 year ago

Test server: qa-edmond2.mpdl.mpg.de
Browser: Firefox
Version: v5.12.1-mpdl-1 (https://github.com/MPDL/dataverse/commit/3dce49cda89399e4bcdb173e40aa6e553c1db1eb)
User: Admin
Result: OK

No more encoding issues appear in the controlled vocabulary when the local citation.tsv configuration is sent via the API call api/admin/datasetfield/load.