Closed: graybeal closed this issue 3 years ago
In response to my email below, our user gave us this report.
Thank you for working with us on this. We are unable to duplicate this behavior when we use the system, so I'd like to ask a few more questions. I hope you don't mind.
I found this metadata instance; is it the one you are working with? https://cedar.metadatacenter.org/instances/edit/https://repo.metadatacenter.org/template-instances/bdb39dc3-00eb-4775-b893-87e1c97c924c
Did you create the instance by typing the text, or by copy-pasting it? If you copy-pasted it, could you try typing some of the same text manually and see if the problem remains? And perhaps copy your original text back to me, so that I have it in the non-transformed format and can try it out from my computer?
Finally, are you using a Mac or a PC?
We have had just 1 or 2 isolated complaints that were similar and would love to understand what is happening in this case.
I was able to recreate the problem. It happens when I rename a metadata file. The first image shows the original metadata.
Then I renamed it to "Teste acentuação" (Accentuation test).
Right after it is renamed, we can see that the new name itself shows the problem we've been talking about.
And then the metadata itself shows the problem. Fields that have accented characters in their names, such as the first one, "Título (Title)", lose whatever was written in them. Fields without accented characters in their names, such as the second one, "Nome (Name)", do not display their values correctly.
We have to rename the metadata instances for better organization, since we might fill in a template more than once. Maybe if we could change the default name before saving, non-ASCII characters would display correctly. All the information was typed, not copy-pasted, using Google Chrome on a PC.
Here is the link for the template: https://cedar.metadatacenter.org/templates/edit/https://repo.metadatacenter.org/templates/13f72344-1f72-45fd-a13b-6dfc78ecbce5?folderId=https:%2F%2Frepo.metadatacenter.org%2Ffolders%2F795db003-ee91-4493-a25a-65c47e1d7ee3
Here is the link for the instance: https://cedar.metadatacenter.org/instances/edit/https://repo.metadatacenter.org/template-instances/ee0ecfdd-61c0-42d0-95d0-23de3ef4df0c?folderId=https:%2F%2Frepo.metadatacenter.org%2Ffolders%2F795db003-ee91-4493-a25a-65c47e1d7ee3
I've fixed this issue. In several places in our resource server and in the CEDAR core library, we were calling EntityUtils.toString without specifying an encoding. In that case, the toString method uses the "ISO-8859-1" encoding by default, so UTF-8 content containing non-ASCII characters was decoded incorrectly. I've fixed it by specifying "UTF-8" as the desired encoding.
This fix will be deployed with the next production release.
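For anyone who wants to see the effect in isolation, here is a minimal sketch using only the Java standard library. It is not the CEDAR code: the `CharsetDemo` class and its `decode` helper are hypothetical stand-ins for the step where an HTTP client turns a response body into a String. It shows that UTF-8 bytes decoded as ISO-8859-1 come back garbled, while decoding them as UTF-8 round-trips the text. Unicode escapes are used for the accented letters so the result does not depend on the source file encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
    // Hypothetical helper (not part of CEDAR or HttpClient): decodes raw
    // body bytes with the given charset, as an HTTP client would.
    public static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // "Teste acentuação", the name used in the report above.
        String original = "Teste acentua\u00e7\u00e3o";

        // The server sends the text as UTF-8 bytes.
        byte[] body = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the wrong charset splits each accented letter
        // into two unrelated characters.
        String wrong = decode(body, StandardCharsets.ISO_8859_1);

        // Decoding with the charset the bytes were written in restores the text.
        String right = decode(body, StandardCharsets.UTF_8);

        System.out.println(wrong.equals(original)); // false
        System.out.println(right.equals(original)); // true
        System.out.println(wrong.length() - original.length()); // 2
    }
}
```

In the actual fix, the equivalent step would be passing the charset as the second argument to EntityUtils.toString rather than relying on its default.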
In the instance https://cedar.metadatacenter.org/instances/edit/https://repo.metadatacenter.org/template-instances/bdb39dc3-00eb-4775-b893-87e1c97c924c the non-ASCII characters are not being displayed correctly. We are unable to duplicate this manually, and are collecting more information about the source of the characters.
The language (idioma) is identified as http://id.loc.gov/vocabulary/iso639-1/pt
(See also https://github.com/metadatacenter/cedar-template-editor/issues/875)