Open Sgaff opened 3 years ago
Schematron only checks there is a 3 letter language code in the codeListValue; no check of codelist URI.
That's good then from the point of view of the change, as it would purely be edits in the website.
If you go to the page http://www.loc.gov/standards/iso639-2/ you can see that there is a link to "ISO 639-2 Code List" from it, so http://www.loc.gov/standards/iso639-2/ is not a link to the code list, and if the INSPIRE validator expects it, then that must surely be an error.
Valid URLs to the code list are:
A link to the code eng in the code list is:
https://www.loc.gov/standards/iso639-2/php/langcodes_name.php?code_ID=130
Basically, I think the INSPIRE validator is wrong to reject URLs with php in them.
Can we feed this back to INSPIRE so they can issue a corrigendum? https://www.loc.gov/standards/iso639-2/ is the example encoding in the TG, so that would need to be changed as well.
Sean
I'll also take the view, based on James' comments, that the GEMINI interpretation is correct, and will leave it this way for the imminent MEDIN release.
@Sgaff : could you raise it as a new issue against the INSPIRE TG, at https://github.com/INSPIRE-MIF/technical-guidelines/issues? Or if it's more an issue with their validator than their text, then raise it at: https://github.com/INSPIRE-MIF/helpdesk-validator
And then close it here.
Related issue / pull request at INSPIRE MIF: https://github.com/inspire-eu-validation/metadata/pull/175.
Note: this only makes the validator's current implementation more tolerant; it does not take into account James' view here that they should be targeting something that returns a value.
@PeterParslow to check whether the suggestion above (to link to the actual value in the codelist) is still valid for inspire, and potentially raise it as an issue with them
Sean is (unsurprisingly) correct that http://www.loc.gov/standards/iso639-2/ is not
Metadata language GEMINI Guidance 1, "It is recommended to select a value from a controlled vocabulary, for example that provided by ISO 639-2 which uses three-letter primary tags with optional subtags.", is rather softer than INSPIRE TG Requirement C.5: metadata/2.0/req/common/metadata-language-code
Similarly, Dataset language (INSPIRE "Resource language") GEMINI Guidance 1. "A code should be selected from ISO 639-2, which uses three-letter primary tags with optional subtags – see http://www.loc.gov/standards/iso639-2/php/code_list.php" is softer than TG Requirement 1.6: metadata/2.0/req/datasets-and-series/resource-language
(wording below is from INSPIRE TG Requirement C.5; 1.6 is almost identical) "The language of the provided metadata content shall be given. It shall be encoded using gmd:MD_Metadata/gmd:language/gmd:LanguageCode element. The attribute codeListValue shall contain one of the three-letter language codes of the ISO 639-2/B code list. The attribute codeList shall be either http://www.loc.gov/standards/iso639-2/ or http://id.loc.gov/vocabulary/iso639-2.
Only the code values for the languages of the Community[19] shall be used.
The multiplicity of this element is 1."
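As a rough illustration of what the quoted requirement asks for, here is a minimal Python sketch that checks a record's gmd:language element. It is not the INSPIRE validator's logic or the GEMINI Schematron: the function name `check_language_code` and the sample record are made up for this example, and it only tests the shape of codeListValue (three lower-case letters), not membership of ISO 639-2/B or the "languages of the Community".

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"

# The two codeList values quoted in TG Requirement C.5.
ALLOWED_CODELISTS = {
    "http://www.loc.gov/standards/iso639-2/",
    "http://id.loc.gov/vocabulary/iso639-2",
}


def check_language_code(xml_text):
    """Return a list of problems found in gmd:language/gmd:LanguageCode.

    Hypothetical helper for illustration only; expects gmd:MD_Metadata
    as the root element of xml_text.
    """
    root = ET.fromstring(xml_text)
    codes = root.findall(f"{{{GMD}}}language/{{{GMD}}}LanguageCode")
    # "The multiplicity of this element is 1."
    if len(codes) != 1:
        return [f"expected exactly one LanguageCode, found {len(codes)}"]

    problems = []
    value = codes[0].get("codeListValue", "")
    # Shape check only: three lower-case ASCII letters.
    if not (len(value) == 3 and value.isascii()
            and value.isalpha() and value.islower()):
        problems.append(f"codeListValue {value!r} is not a three-letter code")

    code_list = codes[0].get("codeList", "")
    if code_list not in ALLOWED_CODELISTS:
        problems.append(f"codeList {code_list!r} is not one of the TG values")
    return problems


# Made-up sample record using the codeList string from the GEMINI guidance.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd">
  <gmd:language>
    <gmd:LanguageCode
        codeList="http://www.loc.gov/standards/iso639-2/php/code_list.php"
        codeListValue="eng">eng</gmd:LanguageCode>
  </gmd:language>
</gmd:MD_Metadata>"""

print(check_language_code(SAMPLE))
```

Under a strict reading like this one, the /php/code_list.php string from the GEMINI guidance is flagged while either of the two TG strings passes, which matches the behaviour reported in this issue.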
Historically, this was because GEMINI allowed itself to be used for records that were not INSPIRE compliant, for example in Welsh - not one of the "languages of the Community" but represented by an ISO 639 code (sadly, two!).
We could harden our Guidance 1 to require ISO 639-2 three-letter codes. This would technically be a breaking change, but may not impact any instances (would need to check).
James is correct (Oct 6, 2021) that http://www.loc.gov/standards/iso639-2/ is not a link to the code list; it's a link to a kind of landing page about the code list. https://www.loc.gov/standards/iso639-2/php/code_list.php links to the ISO 639-2 code list. But INSPIRE requires http://www.loc.gov/standards/iso639-2/ or http://id.loc.gov/vocabulary/iso639-2 (per the TG wording quoted above). That second one does actually redirect to the ISO 639-2 code list, albeit a rather different visual representation of it!
My opinion: direct links to individual codes (e.g. https://www.loc.gov/standards/iso639-2/php/langcodes_name.php?code_ID=130) would be useful, but should not be the value of the codeList attribute, which after all is a URL for the list, not the individual value. I don't see anywhere in an ISO 19139 code list to put a direct link to the individual value. Nor would that be valid for INSPIRE.
What would be valid for INSPIRE, and at least give a link that goes more directly to the code list, would be to require/recommend http://id.loc.gov/vocabulary/iso639-2 - but I don't know if anyone has been using that so far!
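For anyone wanting to see what that encoding would look like, here is a small Python sketch that builds the gmd:language element with http://id.loc.gov/vocabulary/iso639-2 as the codeList. The helper name `language_element` is hypothetical, and repeating the code as the element's text content is an assumption for this example rather than something taken from the TG or the GEMINI guidance.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
# Serialise with the conventional "gmd" prefix instead of ns0.
ET.register_namespace("gmd", GMD)


def language_element(code, code_list="http://id.loc.gov/vocabulary/iso639-2"):
    """Build a gmd:language element holding one gmd:LanguageCode.

    Hypothetical helper for illustration; 'code' is the three-letter
    ISO 639-2 code, 'code_list' is the URL of the list (not of the value).
    """
    language = ET.Element(f"{{{GMD}}}language")
    language_code = ET.SubElement(
        language,
        f"{{{GMD}}}LanguageCode",
        {"codeList": code_list, "codeListValue": code},
    )
    # Assumption: repeat the code as the element text.
    language_code.text = code
    return language


print(ET.tostring(language_element("eng"), encoding="unicode"))
```

The codeList attribute stays a URL for the whole list; a direct link to an individual code such as the langcodes_name.php one has no obvious home in this element.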
Personal thought: no further GEMINI change; we've got the INSPIRE validator "softened" / corrected
Hi,
The current guidance on the GEMINI pages for metadata language and for dataset language states that the codelist string that users should quote for the ISO language codes is
http://www.loc.gov/standards/iso639-2/php/code_list.php
However, if you attempt to run a full XML file through the INSPIRE validator with this encoding in it, it fails on the language element. After some playing around, and looking in inspire-tg-metadata-sio19139-2.0.1.pdf, I identified the problem INSPIRE had as being the presence of the /php/code_list.php part of the string.
I re-built my XML so the language portion was as follows