ropensci / EML

Ecological Metadata Language interface for R: synthesis and integration of heterogenous data
https://docs.ropensci.org/EML

Internationalization - Metadata in multiple languages #300

Closed DrMattG closed 4 years ago

DrMattG commented 4 years ago

Chapter 8 of the EML Specification (https://eml.ecoinformatics.org/internationalization-metadata-in-multiple-languages.html) describes how to add metadata in multiple languages. I cannot work out how to replicate this in R, though. Is it possible?

The example in the chapter is:

`<?xml version="1.0"?>
<eml:eml packageId="eml.1.1" system="knb" xml:lang="pt_BR" xmlns:eml="https://eml.ecoinformatics.org/eml-2.2.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="https://eml.ecoinformatics.org/eml-2.2.0 eml.xsd">

  <dataset id="ds.1">

    <!-- English title with Portuguese translation -->    
    <title xml:lang="en-US">
        Sample Dataset Description
        <value xml:lang="pt-BR">Exemplo Descrição Dataset</value>
    </title>
    ...
    <!-- Portuguese abstract with English translation -->    
    <abstract>
        <para>
            Neste exemplo, a tradução em Inglês é secundário
            <value xml:lang="">In this example, the English translation is secondary</value>
        </para>
    </abstract>
    ...
    <!-- two keywords, each with an equivalent translation -->    
    <keywordSet>
        <keyword keywordType="theme">
            árvore
            <value xml:lang="en-US">tree</value>
        </keyword>
        <keyword keywordType="theme">
            água
            <value xml:lang="en-US">water</value>
        </keyword>
    </keywordSet>
    ...
  </dataset>
</eml:eml>`

amoeba commented 4 years ago

Hey @DrMattG, thanks for reaching out with your question. I actually didn't know whether this was possible and wasn't expecting it to work, but it does.

Here's a basic example showing a multi-lingual title element:

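library(EML)

me <- list(individualName = list(givenName = "Carl", surName = "Boettiger"))
my_eml <- list(dataset = list(
  # The "lang" member is written out as the xml:lang attribute;
  # the unnamed string is the element's own text
  title = list("lang" = "en-US",
               "A Minimal Valid EML Dataset",
               # The nested "value" member holds the translation and its language tag
               value = list("lang" = "pt-BR",
                            "Exemplo Descrição Dataset")),
  creator = me,
  contact = me)
)

# Serialize the list to an EML file (the file name is arbitrary)
write_eml(my_eml, "eml.xml")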

produces

<?xml version="1.0" encoding="UTF-8"?>
<eml:eml xmlns:eml="https://eml.ecoinformatics.org/eml-2.2.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:stmml="http://www.xml-cml.org/schema/stmml-1.2" packageId="5bc0f57b-ce0f-4303-b495-7aaf6e6299f3" system="uuid" xsi:schemaLocation="https://eml.ecoinformatics.org/eml-2.2.0 https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd">
  <dataset>
    <title xml:lang="en-US">A Minimal Valid EML Dataset
      <value xml:lang="pt-BR">Exemplo Descrição Dataset</value>
    </title>
    <creator>
      <individualName>
        <givenName>Carl</givenName>
        <surName>Boettiger</surName>
      </individualName>
    </creator>
    <contact>
      <individualName>
        <givenName>Carl</givenName>
        <surName>Boettiger</surName>
      </individualName>
    </contact>
  </dataset>
</eml:eml>

Handling attributes versus character/element data is a bit odd because of EML's list-based internal representation, so attributes have to be placed as list members (see where I set a "lang" named list member). I don't know why this works, but I can take a look sometime next week.
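
If you end up doing this for many elements, the list shape could be wrapped in a small helper. This is only a sketch: `with_translation()` is a hypothetical function, not part of the EML package, and it just rebuilds the structure used for the title above. I have only seen this work for `title`; whether the same shape serializes correctly for other elements (abstract paragraphs, keywords) is an assumption you would need to check.

```r
# Hypothetical helper (not part of the EML package): builds the same list
# shape as the title example, i.e. a "lang" member, the unnamed element text,
# and a nested "value" member carrying the translation.
with_translation <- function(text, lang, translation, translation_lang) {
  list(
    lang = lang,   # written out as the xml:lang attribute of the element
    text,          # unnamed member: the element's own text
    value = list(
      lang = translation_lang,
      translation
    )
  )
}

title <- with_translation("A Minimal Valid EML Dataset", "en-US",
                          "Exemplo Descrição Dataset", "pt-BR")

me <- list(individualName = list(givenName = "Carl", surName = "Boettiger"))
my_eml <- list(dataset = list(title = title, creator = me, contact = me))

# Only attempt serialization when the EML package is installed.
if (requireNamespace("EML", quietly = TRUE)) {
  out <- tempfile(fileext = ".xml")
  EML::write_eml(my_eml, out)
  cat(readLines(out, encoding = "UTF-8"), sep = "\n")
}
```

The helper keeps the attribute, text and translation in the same order as the hand-written list, so the resulting document should match the output shown earlier apart from the generated packageId.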

In the meantime, can you give that a try and let me know if it makes sense and works?

DrMattG commented 4 years ago

@amoeba That is great and works perfectly, thank you very much! We are just starting out on a new project (Living Norway, https://github.com/LivingNorway) that will lean heavily on the EML package. Being able to include English translations of Norwegian titles, and eventually (hopefully) abstracts, at an early stage in the R workflow will make things so much easier. Thanks again, especially for the prompt reply. We appreciate the help. Tusen takk!