admin-shell-io / aas-specs

Repository of the Asset Administration Shell Specification IDTA-01001 - Metamodel
https://industrialdigitaltwin.org/en/content-hub/aasspecifications
Creative Commons Attribution 4.0 International

[BUG] examples use invalid lang tags #387

Closed VladimirAlexiev closed 3 months ago

VladimirAlexiev commented 3 months ago

XML and RDF lang tags are regulated by BCP47 sec 3.1 and recorded in IANA Language Subtag Registry (Google Sheet iana-lang-tags, last updated 12 April 2017, is an easier way to read it). See https://vocab.getty.edu/doc/#Language for more links and examples of custom lang tags.

The examples here use invalid lang tags, eg:

This means that if you try to ingest them into an RDF repo, you will get errors or warnings. I don't know of a "random lang tag" generator, but please use a couple of fixed tags, eg en, en-US, de, de-AT. (If you want more variety, use sr-Cyrl, sr-Latn, with a bow to @mristin :-)
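A rough way to catch malformed tags before ingestion is a well-formedness check against the RFC 5646 grammar. The sketch below is an assumption-laden simplification, not what any particular RDF store does: the name `is_well_formed` is hypothetical, grandfathered tags are ignored, and the check is purely syntactic, so it does not consult the IANA registry to confirm that each subtag is actually registered.

```python
import re

# Simplified BCP 47 (RFC 5646) "langtag" / "privateuse" grammar.
# Assumption: grandfathered tags are deliberately left out.
_LANGTAG = re.compile(
    r"""
    ^
    (?:
        (?:[a-z]{2,3}(?:-[a-z]{3}){0,3}   # primary language, optional extlangs
          |[a-z]{4}                       # reserved for future use
          |[a-z]{5,8})                    # registered language subtag
        (?:-[a-z]{4})?                    # script, e.g. Cyrl
        (?:-(?:[a-z]{2}|[0-9]{3}))?       # region, e.g. US or 419
        (?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*   # variants
        (?:-[0-9a-wy-z](?:-[a-z0-9]{2,8})+)*       # extensions (singleton + subtags)
        (?:-x(?:-[a-z0-9]{1,8})+)?                 # trailing private-use part
      |
        x(?:-[a-z0-9]{1,8})+              # tag made only of private-use subtags
    )
    $
    """,
    re.IGNORECASE | re.VERBOSE,
)


def is_well_formed(tag: str) -> bool:
    """Return True if `tag` matches the simplified RFC 5646 syntax above."""
    return _LANGTAG.match(tag) is not None


if __name__ == "__main__":
    for tag in ["en", "en-US", "de", "de-AT", "sr-Cyrl", "sr-Latn", "en_US"]:
        print(tag, is_well_formed(tag))
```

The suggested fixed tags (en, en-US, de, de-AT, sr-Cyrl, sr-Latn) all pass this check, while a tag with an underscore such as en_US does not.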

mristin commented 3 months ago

Hi @VladimirAlexiev, Thanks for the report! We actually already noticed the issue: https://github.com/aas-core-works/aas-core3.0-testgen/issues/16

This needs to be fixed in https://github.com/aas-core-works/aas-core3.0-testgen and back-propagated here.

Just for future reference: it would be easiest to fix the fuzzed examples in aas-core3.0-testgen: https://github.com/aas-core-works/aas-core3.0-testgen/blob/e1087880960dae119fd123af29adc02e7dbf340a/aas_core3_0_testgen/frozen_examples/pattern.py#L356-L366

I'll try to have a look at this soon-ish (next two weeks).

VladimirAlexiev commented 3 months ago

You can fuzz using custom langs (eg x-foobar) and subtags (eg sr-x-foobar-SR-Cyrl). Each subtag in the custom part should be 8 chars max, per RFC 5646.
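Fuzzing with private-use tags could look roughly like the sketch below. It is only an illustration and is not taken from aas-core3.0-testgen: `random_private_use_tag` and its parameters are hypothetical names. It relies on the RFC 5646 rule that everything after the `x` singleton is private use and that each such subtag is 1 to 8 alphanumeric characters.

```python
import random
import string
from typing import Optional

_ALNUM = string.ascii_lowercase + string.digits


def random_private_use_tag(
    rng: random.Random,
    prefix: Optional[str] = None,
    max_subtags: int = 2,
) -> str:
    """Build a tag such as 'x-foobar' or '<prefix>-x-foobar'.

    Each private-use subtag is 1 to 8 alphanumeric characters long.
    `prefix` is assumed to be an already valid tag such as 'sr' or 'en-US'.
    """
    count = rng.randint(1, max_subtags)
    subtags = [
        "".join(rng.choice(_ALNUM) for _ in range(rng.randint(1, 8)))
        for _ in range(count)
    ]
    private = "x-" + "-".join(subtags)
    return private if prefix is None else f"{prefix}-{private}"


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed keeps the generated examples reproducible
    print(random_private_use_tag(rng))
    print(random_private_use_tag(rng, prefix="sr"))
```

Seeding the generator keeps the fuzzed examples stable between runs, which matters when the output is committed as frozen test data.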

mristin commented 3 months ago

@VladimirAlexiev I fixed the issue in https://github.com/aas-core-works/aas-core3.0-testgen/pull/28. Could you please verify that all the language tags are valid and can be ingested before I copy the examples to this repository?

VladimirAlexiev commented 3 months ago

@mristin

The structure of tags is described in https://www.rfc-editor.org/rfc/rfc5646.html, and the registry is https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry

I found a couple of doubtful tags. But I think it's ok to leave them in, because they are still valid, or could be valid if the respective singleton chars were registered in the future.

"en-a-myext-b-another" "zh-CN-a-myext-x-private" "en-US-u-islamcal"
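To make the "doubtful" part concrete, the sketch below pulls the extension singletons out of each tag and compares them against the singletons registered with IANA. The helper names are hypothetical, and the registered set is an assumption: to my knowledge only `t` and `u` were registered at the time of writing.

```python
# Assumption: 't' and 'u' are the only registered extension singletons.
REGISTERED_SINGLETONS = {"t", "u"}


def extension_singletons(tag: str) -> list:
    """Return the extension singletons of `tag`, ignoring the private-use part."""
    found = []
    for subtag in tag.lower().split("-")[1:]:
        if subtag == "x":  # everything after 'x' is private use
            break
        if len(subtag) == 1:
            found.append(subtag)
    return found


def unregistered_singletons(tag: str) -> list:
    """Return singletons that are well-formed but not (yet) registered."""
    return [s for s in extension_singletons(tag) if s not in REGISTERED_SINGLETONS]


if __name__ == "__main__":
    for tag in ["en-a-myext-b-another", "zh-CN-a-myext-x-private", "en-US-u-islamcal"]:
        print(tag, unregistered_singletons(tag))
```

Under that assumption, the first two tags use the unregistered singletons `a` and `b`, while the third uses the registered `u` singleton.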

mristin commented 3 months ago

@VladimirAlexiev

> I found a couple of doubtful tags.

Can you please quickly test with your tools and check that they do not complain about those tags? If they do, I'll remove them to avoid confusion.

VladimirAlexiev commented 3 months ago

They are accepted. Closing.

mristin commented 3 months ago

@VladimirAlexiev thanks! I'll update the examples in this repository this Wed.