INSPIRE-MIF / helpdesk-validator

Community discussion forum for INSPIRE validation issues

Validation error on LanguageCode #52

Closed alitka closed 4 years ago

alitka commented 5 years ago

Testing a metadata record with the "new TG 2.0 tests" causes errors concerning the code list attribute at several points.

One example is the language code in “md common req C.5”. The accompanying text says that the code list shall be 'http://www.loc.gov/standards/iso639-2/'. This conflicts with TG Metadata 2.0.1, where section 2.1.1 (Encoding of code list values) states that only the codeListValue attribute is relevant (TG Requirement C.3). The additional note describes both the value of the codeList attribute (a URL that references a code list definition within a register or a code list catalogue) and the textual content of the ISO 19139 element as purely informative.

The test further detects similar alleged violations in citing code lists regarding the role code of contacts, date types of given date information, scope codes of quality information and so on.
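For readers less familiar with the encoding, the attributes under discussion can be read from a record as in the sketch below. It is only an illustration: the sample fragment and the helper name `read_language_code` are assumptions made for this example and are not taken from the validator or from the tested record.

```python
import xml.etree.ElementTree as ET

# ISO 19139 "gmd" namespace.
GMD = "http://www.isotc211.org/2005/gmd"

# Illustrative fragment (assumed, not the record from this issue).
SAMPLE = """<gmd:language xmlns:gmd="http://www.isotc211.org/2005/gmd">
  <gmd:LanguageCode codeList="http://www.loc.gov/standards/iso639-2/"
                    codeListValue="ger">German</gmd:LanguageCode>
</gmd:language>"""


def read_language_code(xml_text):
    """Return the three pieces of information carried by gmd:LanguageCode."""
    root = ET.fromstring(xml_text)
    elem = root.find(f"{{{GMD}}}LanguageCode")
    return {
        # URL of the code list: the attribute the test complains about.
        "codeList": elem.get("codeList"),
        # The code itself: the attribute TG Requirement C.3 is about.
        "codeListValue": elem.get("codeListValue"),
        # Free text inside the element.
        "text": (elem.text or "").strip(),
    }


info = read_language_code(SAMPLE)
print(info)
```

The question in this thread is whether a validator should look only at `codeListValue` or also compare `codeList` against a fixed URL.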

josemasensio commented 5 years ago

Dear @alitka,

Thank you for your comments.

We are investigating the error. Maybe the test 'md common req C.5: Language code' should be modified.

We will come back to you with the possible solution.

Best regards.

MarcoMinghini commented 5 years ago

Dear @alitka,

After reviewing the TG, @fabiovin and I have a slightly different interpretation. Requirement C.3 only sets the rule for the use of the codeListValue attribute, and its introductory text says that the relevant requirements name the code list to be used in each case.

In the specific case of the Metadata language element, Requirement C.5 forces the use of the ISO 639-2/B code list (http://www.loc.gov/standards/iso639-2/).

The wording "purely informative" in C.3 might be the source of this (possible) misunderstanding.

alitka commented 5 years ago

It is indisputable that the content of an element based on a CodeList has to be taken from one specific CodeList. The "source" of the permissible values of the codeListValue for the language code is unambiguously regulated in C.5. In other cases (role code, date type etc.) the ISO standard regulates this itself, as it contains these CodeLists directly. Only the footnotes name a particular code list URL to be used.

The two paragraphs following requirement C.3 are in conflict with this as there can be individually extended CodeLists, which of course contain the official values as well (even if you are not allowed to use these individual values for INSPIRE in this case). Nevertheless under codeList you would have a URL that differs from the requirement.

MarieLambois commented 5 years ago

I think that the footnotes in Metadata TG v2 should be reworded. I agree with Anja: there is no need to fix the value of the codelist attributes. These are not URIs but really just the location of the resource. This location might change, and other codelists (extended, localized, etc.) might be used. Fixing those codelist attributes would go against the flexibility ISO gives to codelists.

AntoRot commented 5 years ago

I agree that there is a conflict between the paragraphs following TG req C.3 and the obligation for the value of the codelist attribute given in the footnotes linked to different requirements concerning the metadata elements encoded using specific codelists (TG req C.5 for the language).

But I noticed that:

Consequently, my conclusion is that, at least for the language, we should use the value of the codelist fixed in the footnote of the TG req C.5.

AntoRot commented 5 years ago

An additional comment. I tried to test a metadata 2.0 XML file using the value "http://www.loc.gov/standards/iso639-2" (without the final /) and the same error message returned (that the codelist attribute is not correct), although the URL is resolved. In this case I think that the check could be less strict accepting both the values (with or without the final /).
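A less strict check of this kind could be sketched as follows. This is only an illustration of the idea, not the validator's actual code; the function name `same_codelist` and the constant are assumptions made for this example.

```python
# Code list URL named for the language code (TG req C.5).
EXPECTED_CODELIST = "http://www.loc.gov/standards/iso639-2/"


def same_codelist(actual, expected=EXPECTED_CODELIST):
    """Compare two code list URLs, ignoring trailing slashes and outer whitespace."""
    return actual.strip().rstrip("/") == expected.rstrip("/")


# Both spellings of the URL are accepted; a different URL is not.
print(same_codelist("http://www.loc.gov/standards/iso639-2/"))
print(same_codelist("http://www.loc.gov/standards/iso639-2"))
print(same_codelist("http://example.org/other-codelist"))
```

Stripping the trailing slash on both sides keeps the comparison symmetric, so it does not matter which spelling is configured as the expected value.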

MarcoMinghini commented 5 years ago

2017.4 meeting 2019-05-22

alitka commented 5 years ago

Just a note by Ilkka Rinne from 2016:

[...] Based on that discussion [...], it became clear that the URIs given as the codeList attributes of the ISO code list elements is not authoritative, but informative: the authoritative source of the possibly values of the particular code list are given in the ISO standards document, regardless of the resource pointed by the codeList attribute. This is probably the reason why the codeList attribute is ignored by the validators. For the ISO code lists the validators shall only check that the codeListValue is one of the permitted values provided by the standard document for the particular code list, and can thus ignore the codeList attribute.

[...] However it should be pointed out, that as far as I know there are no authoritative and permanent copies of the ISO code lists expressed in XML. Neither the ISO copies under http://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_19139_Schemas/resources nor under http://www.isotc211.org/2005/resources are guaranteed to be permanently available.

Neither the INSPIRE Metadata guidance version 1.3 nor 2.0 shall mandate the use of particular URLs for the ISO code lists for this reason.

(https://themes.jrc.ec.europa.eu/discussion/view/95954/what-to-use-for-ci-datetypecode-codelist)
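The check described in that note, where only the codeListValue is compared with the values defined in the standard and the codeList attribute is ignored, might look like the sketch below. It is a minimal illustration under stated assumptions: the helper name `date_type_is_valid`, the sample element and its example.org URL are made up for this example, and the value set is the CI_DateTypeCode list from ISO 19115:2003.

```python
import xml.etree.ElementTree as ET

# CI_DateTypeCode values as defined in ISO 19115:2003.
CI_DATE_TYPE_CODES = {"creation", "publication", "revision"}


def date_type_is_valid(xml_text):
    """Check only codeListValue; the codeList attribute is deliberately ignored."""
    elem = ET.fromstring(xml_text)
    return elem.get("codeListValue") in CI_DATE_TYPE_CODES


# Assumed sample: the codeList URL points to a hypothetical local catalogue.
record = (
    '<gmd:CI_DateTypeCode xmlns:gmd="http://www.isotc211.org/2005/gmd" '
    'codeList="http://example.org/my-codelists#CI_DateTypeCode" '
    'codeListValue="publication">publication</gmd:CI_DateTypeCode>'
)

print(date_type_is_valid(record))
```

With this approach a record that references an extended or relocated code list still passes, as long as the value itself is one of the official ones.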

josemasensio commented 5 years ago

Dear @alitka,

We have modified the test so that the code list http://www.loc.gov/standards/iso639-2 (without the final /) is also accepted.

Waiting for your feedback.

Regards.

josemasensio commented 5 years ago

After some internal tests, we confirmed that everything is working fine, so we will mark this issue as solved.

Regards.

eblondel commented 5 years ago

Hello @josemasensio @inakidiazdecerio, thanks for this fix. When do you expect to release it on the official INSPIRE validator (http://inspire.ec.europa.eu/validator)? Thanks in advance.

danielnavarrogeo commented 4 years ago

Now the codeList is only checked in 'C.5 Language Code' and '1.6 Resource Language' tests. https://github.com/inspire-eu-validation/community/issues/131

AntoRot commented 4 years ago

Despite the decision to also accept the code list http://www.loc.gov/standards/iso639-2 (without the trailing slash) (see https://github.com/inspire-eu-validation/community/issues/52#issuecomment-496237211), the validator continues to raise an error on the language code element if this code list is used in the metadata records. Please update the reference validator according to the decision mentioned above.

danielnavarrogeo commented 4 years ago

Dear @AntoRot,

thanks for pointing out the issue.

Unfortunately, the solution was not fully implemented, and this bug wrongly passed through the Helpdesk management procedure in place at that time. After the implementation, as we did not receive feedback from the user, a reminder was sent and, after two weeks, the issue was marked as solved.

Regrettably, the bug slipped through this development procedure, which caused the issue to reappear afterwards. We sincerely apologize for the inconvenience.

As soon as you raised the error, we verified it and updated the validator accordingly. We have treated this issue as a hotfix, so it will be included in the next release (v2020.1 - 18/03/2020).

Finally, we would like to note that, to prevent this kind of error from happening again, we have improved the Helpdesk management procedure: pull requests now go through a peer review, and users are asked to accept the developed solution.

Regards

AntoRot commented 4 years ago

Dear @danielnavarrogeo,

Thank you very much for the prompt reply and solution of the issue.

Regards