INSPIRE-MIF / gp-data-service-linking-simplification

Good Practice on a consensus-based simplified approach for INSPIRE data and service linkages

Part B - Remapping of Extended Capabilities - About Supported languages #43

Closed jescriu closed 1 year ago

jescriu commented 2 years ago

This issue is for discussing proposals on the mapping of the Supported languages element, which as per Annex B of the consolidated proposal is still to be agreed (TBD).

As an initial point of discussion, the mentioned Annex proposes to consider the metadataLanguage in the data set metadata.

Support for multilingualism is a key aspect of INSPIRE.

Please provide your input/proposals on this element in order to reach a consensus by the sub-group.

AntoRot commented 2 years ago

The Implementation Requirement 71 in INSPIRE NS - View Service TG (or the TG Requirement 59 in INSPIRE NS - Download Service TG) reads as follows:

This list of supported languages shall consist of

  1. exactly one element indicating the service default language, and
  2. zero or more elements to indicate all additional supported languages.

The metadata language element in the data set metadata record could correspond to the <inspire_common:DefaultLanguage> mandatory element.

Concerning the supported languages, the easiest solution could be not to map any additional elements to <inspire_common:SupportedLanguage>: since this element is optional, the above requirement would still be met.
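
As an illustration of the cardinality rule quoted above, here is a minimal Python sketch that checks an ExtendedCapabilities fragment for exactly one DefaultLanguage and zero or more SupportedLanguage elements. The function name and the sample fragment are hypothetical; the element layout (a Language child inside DefaultLanguage / SupportedLanguage) and the inspire_common 1.0 namespace are assumptions based on the INSPIRE common schema, not something stated in this thread.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the inspire_common schema (version 1.0).
IC = "http://inspire.ec.europa.eu/schemas/common/1.0"

# Hypothetical fragment for the "simpler solution": default language only.
SAMPLE = """<inspire_common:SupportedLanguages
    xmlns:inspire_common="http://inspire.ec.europa.eu/schemas/common/1.0">
  <inspire_common:DefaultLanguage>
    <inspire_common:Language>ita</inspire_common:Language>
  </inspire_common:DefaultLanguage>
</inspire_common:SupportedLanguages>"""


def check_supported_languages(xml_text):
    """Return (default, additional) or raise ValueError if the rule is broken."""
    root = ET.fromstring(xml_text)
    tag = f"{{{IC}}}SupportedLanguages"
    # The fragment may be the root itself or nested in a Capabilities document.
    block = root if root.tag == tag else root.find(f".//{tag}")
    if block is None:
        raise ValueError("no SupportedLanguages element found")
    defaults = [e.findtext(f"{{{IC}}}Language")
                for e in block.findall(f"{{{IC}}}DefaultLanguage")]
    additional = [e.findtext(f"{{{IC}}}Language")
                  for e in block.findall(f"{{{IC}}}SupportedLanguage")]
    if len(defaults) != 1:
        raise ValueError(f"expected exactly one DefaultLanguage, found {len(defaults)}")
    return defaults[0], additional


print(check_supported_languages(SAMPLE))  # ('ita', [])
```

With no SupportedLanguage element at all, the check still passes, which is the point of the simpler solution.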

A more complicated solution could be to use the locale element to declare the supported languages, and to provide the translations of the title and abstract elements (the two fields affected) in those languages through the LocalisedCharacterString element. Section A.6 of the INSPIRE metadata TG v. 1.3 is dedicated to multilingual metadata and suggests a similar approach.

Here is an example (where defaultLanguage=ita and supportedLanguage=eng):

...
<!-- defaultLanguage -->
<gmd:language>
   <gmd:LanguageCode codeList="http://www.loc.gov/standards/iso639-2" codeListValue="ita">ita</gmd:LanguageCode>
</gmd:language>
...
<!-- supportedLanguages -->
<gmd:locale>
   <gmd:PT_Locale id="locale-en">
      <gmd:languageCode>
         <gmd:LanguageCode codeList="http://www.loc.gov/standards/iso639-2/" codeListValue="eng">English</gmd:LanguageCode>
      </gmd:languageCode>
      <gmd:characterEncoding>
         <gmd:MD_CharacterSetCode codeList="http://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_19139_Schemas/resources/codelist/gmxCodelists.xml#MD_CharacterSetCode" codeListValue="utf8">utf8</gmd:MD_CharacterSetCode>
      </gmd:characterEncoding>
   </gmd:PT_Locale>
</gmd:locale>
...
<gmd:title xsi:type="PT_FreeText_PropertyType">
   <gco:CharacterString>Zone di applicazione delle misura di contenimento del contagio da COVID-19 (RNDT Dataset)</gco:CharacterString>
   <gmd:PT_FreeText>
      <gmd:textGroup>
         <gmd:LocalisedCharacterString locale="#locale-en">Actions's implementation zone for containing COVID-19 contagion (RNDT Dataset)</gmd:LocalisedCharacterString>
      </gmd:textGroup>
   </gmd:PT_FreeText>
</gmd:title>
...
<gmd:abstract xsi:type="PT_FreeText_PropertyType">
   <gco:CharacterString>Porzione di territorio o uno o più unità amministrative (province, regioni o intero territorio nazionale), anche diverse tra loro, per le quali sono state adottate misure di contenimento ai fini della prevenzione rivolte alla popolazione e con esclusione del personale sanitario, delle forze di polizia, del corpo nazionale dei vigili del fuoco e delle forze armate, nell'esercizio delle loro funzioni. Le misura di contenimento sono differenziate per estensione e gravità. Tra le misura previste nell'art.1 c) è il divieto di allontanamento e il divieto di accesso</gco:CharacterString>
   <gmd:PT_FreeText>
      <gmd:textGroup>
         <gmd:LocalisedCharacterString locale="#locale-en">Portion of territory or one or more administrative units (provinces, regions or entire national territory), also different from each other, for which containment actions have been adopted for the purpose of prevention aimed at the population and with the exclusion of health personnel, police, the national fire department and the armed forces, in the exercise of their functions. The containment measures are differentiated by extension and severity. Among the actions provided for in art.1 is the ban on removal and the ban on access.</gmd:LocalisedCharacterString>
      </gmd:textGroup>
   </gmd:PT_FreeText>
</gmd:abstract>
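
To make the mapping in the example above concrete, here is a minimal Python sketch that reads the candidate DefaultLanguage from gmd:language and the candidate SupportedLanguage values from the gmd:PT_Locale entries. The function name and the trimmed sample record are hypothetical, and the sketch assumes the root element is gmd:MD_Metadata in the ISO 19139 gmd namespace. Direct-child paths are used so that the resource language inside identificationInfo is not picked up by mistake.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"

# Trimmed, hypothetical record: metadata language ita, one extra locale eng.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd">
  <gmd:language>
    <gmd:LanguageCode codeList="http://www.loc.gov/standards/iso639-2/" codeListValue="ita">ita</gmd:LanguageCode>
  </gmd:language>
  <gmd:locale>
    <gmd:PT_Locale id="locale-en">
      <gmd:languageCode>
        <gmd:LanguageCode codeList="http://www.loc.gov/standards/iso639-2/" codeListValue="eng">English</gmd:LanguageCode>
      </gmd:languageCode>
    </gmd:PT_Locale>
  </gmd:locale>
</gmd:MD_Metadata>"""


def languages_from_metadata(xml_text):
    """Return (default_language, additional_languages) from a metadata record."""
    root = ET.fromstring(xml_text)
    # Metadata language: direct child of MD_Metadata, proposed DefaultLanguage.
    code = root.find(f"{{{GMD}}}language/{{{GMD}}}LanguageCode")
    default = code.get("codeListValue") if code is not None else None
    # Each declared locale is a candidate SupportedLanguage.
    path = (f"{{{GMD}}}locale/{{{GMD}}}PT_Locale/"
            f"{{{GMD}}}languageCode/{{{GMD}}}LanguageCode")
    additional = []
    for code in root.iterfind(path):
        value = code.get("codeListValue")
        if value and value != default and value not in additional:
            additional.append(value)
    return default, additional


print(languages_from_metadata(SAMPLE))  # ('ita', ['eng'])
```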

We used those elements in some metadata records conformant with the INSPIRE metadata TG v. 1.3. Unfortunately, when we tried to use them in records conformant with the latest version (2.0) of the metadata TG, validation against the XSD schema returned errors (possibly because PT_FreeText elements and gmx:Anchor elements are present in the same record, but this is still an open issue for us).

Consequently, in my opinion the only solution we can follow is the simpler one proposed above ;)

heidivanparys commented 2 years ago

For WFS, note that some functionality for multilingualism is foreseen in the ServiceIdentification part of the GetCapabilities response; see the OGC Web Services Common Specification.

   <ows:ServiceIdentification>
      <ows:Title xml:lang="en">My WFS</ows:Title>
      <ows:Title xml:lang="da">Min WFS</ows:Title>
      <ows:Abstract xml:lang="en">My abstract</ows:Abstract>
      <ows:Abstract xml:lang="da">Min abstrakt</ows:Abstract>
      <ows:Keywords>
        <ows:Keyword xml:lang="en">My keyword</ows:Keyword>
        <ows:Keyword xml:lang="da">Mit emneord</ows:Keyword>
      </ows:Keywords>
      <ows:ServiceType>WFS</ows:ServiceType>
      <ows:ServiceTypeVersion>2.0.2</ows:ServiceTypeVersion>
      <ows:ServiceTypeVersion>2.0.1</ows:ServiceTypeVersion>
      <ows:ServiceTypeVersion>2.0.0</ows:ServiceTypeVersion>
      <ows:ServiceTypeVersion>1.1.0</ows:ServiceTypeVersion>
      <ows:ServiceTypeVersion>1.0.0</ows:ServiceTypeVersion>
      <ows:Fees>NONE</ows:Fees>
      <ows:AccessConstraints>NONE</ows:AccessConstraints>
   </ows:ServiceIdentification>

I don't know whether servers actually support this. But extracting the xml:lang values could perhaps be an option for the supported languages.
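
A minimal Python sketch of that extraction, assuming the Capabilities document uses the OWS 1.1 namespace (as WFS 2.0 does) and that only Title and Abstract are inspected. The function name and the shortened sample are hypothetical.

```python
import xml.etree.ElementTree as ET

OWS = "http://www.opengis.net/ows/1.1"
# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE = """<ows:ServiceIdentification xmlns:ows="http://www.opengis.net/ows/1.1">
  <ows:Title xml:lang="en">My WFS</ows:Title>
  <ows:Title xml:lang="da">Min WFS</ows:Title>
  <ows:Abstract xml:lang="en">My abstract</ows:Abstract>
  <ows:Abstract xml:lang="da">Min abstrakt</ows:Abstract>
  <ows:ServiceType>WFS</ows:ServiceType>
</ows:ServiceIdentification>"""


def languages_in_service_identification(capabilities_xml):
    """Collect distinct xml:lang values of Title and Abstract, in document order."""
    root = ET.fromstring(capabilities_xml)
    tag = f"{{{OWS}}}ServiceIdentification"
    # Accept either a bare ServiceIdentification or a full Capabilities document.
    ident = root if root.tag == tag else root.find(f".//{tag}")
    languages = []
    if ident is None:
        return languages
    for name in ("Title", "Abstract"):
        for element in ident.iterfind(f"{{{OWS}}}{name}"):
            lang = element.get(XML_LANG)
            if lang and lang not in languages:
                languages.append(lang)
    return languages


print(languages_in_service_identification(SAMPLE))  # ['en', 'da']
```

Note that xml:lang carries values such as "en" and "da", while the INSPIRE language elements use three-letter codes such as "eng", so a conversion step would still be needed.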

The Atom format also supports the xml:lang attribute.

Note that in the Good Practice for OGC API Features, multilingualism is seen as optional, see also https://github.com/INSPIRE-MIF/gp-ogc-api-features/blob/master/spec/oapif-inspire-download.md#req-multilinguality.

jescriu commented 2 years ago

@AntoRot,

> We used those elements in some metadata records conformant with the INSPIRE metadata TG v. 1.3, but unfortunately when we tried to use them in the records conformant to the latest version 2.0 of metadata TG we experienced that some errors were returned from the validation against the XSD schema (maybe due to the contextual presence in the record of PT_FreeText elements and gmx:Anchor elements, but this is still an open issue for us).

If this is still an open issue, we kindly ask you to open a new issue in the INSPIRE Reference Validator helpdesk.

Otherwise we can also open it, but it would be better if you could describe it providing all details and multilingual metadata example files.

jescriu commented 2 years ago

Thanks @heidivanparys for your valuable input on how the supported languages could be derived from WMS, WFS and ATOM implementations.

Before going ahead with this: would it make sense to try to retrieve them automatically by making iterative requests like 'https://service-uri?service=wms&request=GetCapabilities&language=xxx', replacing xxx in each iteration with a different language value from the applicable domain? This would require that, when a language is not available, the service returns an exception or a blank Capabilities document (which I think is not currently the case as per the TG NS).

The default language could be retrieved in a similar way, by checking which Capabilities document returned by 'https://service-uri?service=wms&request=GetCapabilities&language=xxx' is equal to the Capabilities document returned by 'https://service-uri?service=wms&request=GetCapabilities'.

Just an idea.
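
A minimal Python sketch of this probing idea. It relies on the behaviour described above as a prerequisite (an exception or an empty document for an unavailable language), which the thread notes is probably not what services do today. The function names are hypothetical, and `fetch` is an injected callable (URL in, document text out) so the logic can be tried without network access.

```python
from urllib.parse import urlencode


def capabilities_url(base_url, service, language=None):
    """Build a GetCapabilities URL, optionally with the INSPIRE language parameter."""
    params = {"service": service, "request": "GetCapabilities"}
    if language:
        params["language"] = language
    return f"{base_url}?{urlencode(params)}"


def probe_languages(base_url, service, candidates, fetch):
    """Return (default_language, supported_languages) by probing each candidate.

    Assumption: fetch raises or returns an empty document when the language
    is not available. A service that silently falls back to its default
    language would make every candidate look supported.
    """
    default_doc = fetch(capabilities_url(base_url, service))
    default, supported = None, []
    for language in candidates:
        try:
            doc = fetch(capabilities_url(base_url, service, language))
        except Exception:
            continue  # treated as "language not available"
        if not doc:
            continue  # blank Capabilities document
        supported.append(language)
        # The default language is the one whose document equals the
        # response obtained without any language parameter.
        if default is None and doc == default_doc:
            default = language
    return default, supported
```

A plain string comparison is the simplest possible equality test; real responses may differ in incidental details (for example URLs that echo the language parameter), so a more tolerant comparison would likely be needed.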

MarieLambois commented 2 years ago

I think we could distinguish two cases:

- When only one language is available. In this case (the most common, I think) maybe we could take the language to be the same as that of the metadata (as @AntoRot suggests) or have it in a keyword.
- When several languages are supported (maybe @LauraAlemany has a use case, I remember something about that), the language becomes a parameter of the request, so I would have it like this:

FRE ENG ESP

However, I was not able to formally identify whether it is really allowed to declare a new parameter like this.

heidivanparys commented 2 years ago

Regarding the language parameter: it seems that this really is an INSPIRE extension.

(1) From the TG download:

The HTTP/GET binding of the GetCapabilities-Operation is extended by an additional parameter that indicates the client's preferred language.

With the on-going development of OWS Common it is expected that future versions of OGC Standards will include language support. For specific technical reasons, the concepts used for OWS common are not suitable to extend the current standards. However, with the availability of future versions of the OGC base standards the recommended approach to support multilingualism may need to be revisited.

(2) From the OGC Web Services Common Specification:

This document specifies how multiple text values in different languages shall be encoded in XML for specific parameters. [...]

(see examples with xml:lang in an earlier comment; something like this is not present at all in the older WMS specification)

The mechanism for negotiating the language(s) to be communicated is beyond the current scope of this document.

So the OWS 1.1 specification does not specify how to retrieve service metadata in a specific language; it only specifies what the response to a request should look like.

(3) From https://github.com/inspire-eu-validation/RETIRED-ats-download-predefined-wfs/issues/8#issuecomment-144057137

[...] the similar case of the LANGUAGE parameter, required an extension. [...]

I agree with @AntoRot and @MarieLambois that the best solution for the case of one language would be assuming that the service metadata language is the same as the dataset metadata language. It could perhaps be written in a requirement that “the service metadata shall be written in the same language as the dataset metadata”? This requirement can only be verified manually.

LauraAlemany commented 2 years ago

We (SDI of Spain) think that it is important to specify in the GetCapabilities the language, so that all users can see in which languages they can request the GetCapabilities response.

In our opinion, it is better to maintain the Extended Capabilities.

As @MarieLambois said, we have some services with the GetCapabilities response in more than one language. Here is an example (not sure if this was what you were asking for):

https://www.ign.es/wms-inspire/unidades-administrativas?request=GetCapabilities&service=WMS
https://www.ign.es/wms-inspire/unidades-administrativas?request=GetCapabilities&service=WMS&language=eng

AntoRot commented 2 years ago

@jescriu,

Sorry for the late reply.

> If this is still an open issue, we kindly ask you to open a new issue in the INSPIRE Reference Validator helpdesk.

I already intended to open an issue in the INSPIRE Reference Validator helpdesk, but I'm afraid that the issue I raised concerns the ISO XSD schema and not the INSPIRE validator.

Furthermore, even opening an issue in the ISO schema repository may not mean that the issue will be addressed, as we still use the deprecated ISO 19115 Standard.

Nevertheless, I will open one (in the INSPIRE validator helpdesk), also in order to collect comments and similar experiences, as well as possible solutions that someone else may have already adopted.

AntoRot commented 2 years ago

A summary proposal:

In any case, linked with the last sentence, data and service providers may add the optional ExtendedCapabilities section.

jescriu commented 2 years ago

In my view, taking into account the different inputs from @AntoRot but also from other participants in the thread, it makes sense to propose:

NOTE 1: It is reasonable to expect multilingualism of services when multilingualism of metadata is used.

NOTE 2: To be checked whether other upcoming multilingualism encodings should be taken into account.

MarieLambois commented 2 years ago

Discussion 2022-02-25:

@jescriu reminded the attendees of the possibility of obtaining the supported languages automatically (see https://github.com/INSPIRE-MIF/gp-data-service-linking-simplification/issues/43#issuecomment-1022528187) by requesting the service in all the possible languages and analysing its responses. However, this functionality could probably only be added to the new INSPIRE Geoportal, and the impact on service performance might not be negligible. @AntoRot (IT) fully agrees with this approach.

@AntoRot also recalled the possibility of mapping (1) the DefaultLanguage element to the dataset metadata language and (2) the SupportedLanguage element to the locale element (PT_Locale) values, in the case of multilingual dataset metadata, as he suggested in https://github.com/INSPIRE-MIF/gp-data-service-linking-simplification/issues/43#issuecomment-1022105139. On this proposal, @AntoRot (IT) mentioned that he had noticed a potential issue with the validation of multilingual metadata (using the locale element) when the record contains Anchor encodings. He will open an issue in the INSPIRE Reference Validator helpdesk.

@MarieLambois (FR) considered this last proposal quite complex. Additionally, she stated that this solution will not work for France, because they offer multilingual dataset metadata but not service Capabilities documents in different languages (because of limited resources to maintain them).

@idevisser (NL) recalled the comment https://github.com/INSPIRE-MIF/gp-data-service-linking-simplification/issues/43#issuecomment-1050773537 added to the discussion thread. According to it:

- in the case of one language (DefaultLanguage), use the dataset metadata language;
- in the case of further supported languages, for WFS or Atom, use the xml:lang attribute for the two elements affected (title and abstract);
- in the case of further supported languages, for WMS, keep the possibility of including the (optional) ExtendedCapabilities section, including the SupportedLanguages elements.

Finally, the attendees agreed to take this last approach, but to allow approximately 1 week for further discussion in the related issue thread (https://github.com/INSPIRE-MIF/gp-data-service-linking-simplification/issues/43).
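
The agreed approach can be summarised as a small decision rule. The Python sketch below is only an illustration of the three cases listed in the discussion notes; the function name, the accepted service-type labels and the returned descriptions are all hypothetical.

```python
def language_mapping(service_type, multilingual):
    """Describe where each language element is taken from under the agreed approach.

    service_type: "wms", "wfs" or "atom" (case-insensitive, hypothetical labels).
    multilingual: True if the service supports more than the default language.
    """
    kind = service_type.strip().lower()
    if kind not in {"wms", "wfs", "atom"}:
        raise ValueError(f"unsupported service type: {service_type!r}")
    # Case 1: the default language always comes from the dataset metadata.
    mapping = {"DefaultLanguage": "dataset metadata language"}
    if multilingual:
        if kind in {"wfs", "atom"}:
            # Case 2: WFS and Atom carry languages on the elements themselves.
            mapping["SupportedLanguage"] = "xml:lang on title and abstract"
        else:
            # Case 3: WMS keeps the optional INSPIRE extension.
            mapping["SupportedLanguage"] = "ExtendedCapabilities SupportedLanguages (optional)"
    return mapping


print(language_mapping("WFS", multilingual=True))
```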

MarieLambois commented 2 years ago

(very first draft, I will improve. @heidivanparys and others feel free to suggest improvements)

Proposed mapping and rationale

The Default Language will be set to the Dataset Metadata Default Language. The other supported languages (if any) will be mapped to the xml:lang attribute for WFS and ATOM, and to the SupportedLanguages element of the INSPIRE GetCapabilities extension for WMS.

Detailed mapping description

For multiple language support:

   <ows:ServiceIdentification>
      <ows:Title xml:lang="en">My WFS</ows:Title>
      <ows:Title xml:lang="da">Min WFS</ows:Title>
      <ows:Abstract xml:lang="en">My abstract</ows:Abstract>
      <ows:Abstract xml:lang="da">Min abstrakt</ows:Abstract>
      <ows:Keywords>
        <ows:Keyword xml:lang="en">My keyword</ows:Keyword>
        <ows:Keyword xml:lang="da">Mit emneord</ows:Keyword>
      </ows:Keywords>
      <ows:ServiceType>WFS</ows:ServiceType>
      <ows:ServiceTypeVersion>2.0.2</ows:ServiceTypeVersion>
      <ows:ServiceTypeVersion>2.0.1</ows:ServiceTypeVersion>
      <ows:ServiceTypeVersion>2.0.0</ows:ServiceTypeVersion>
      <ows:ServiceTypeVersion>1.1.0</ows:ServiceTypeVersion>
      <ows:ServiceTypeVersion>1.0.0</ows:ServiceTypeVersion>
      <ows:Fees>NONE</ows:Fees>
      <ows:AccessConstraints>NONE</ows:AccessConstraints>
   </ows:ServiceIdentification>

Changes to the current INSPIRE framework

In the Download Service Technical Guidelines, add the following requirement:

Requirement: If the service supports several languages and there are no Extended Capabilities, the xml:lang attribute should be used to define the language used.

(insert the example above)

jescriu commented 1 year ago

This issue has been taken into account in the current specification.