Closed VladimirAlexiev closed 3 months ago
Hi @VladimirAlexiev, Thanks for the report! We actually already noticed the issue: https://github.com/aas-core-works/aas-core3.0-testgen/issues/16
This needs to be fixed in https://github.com/aas-core-works/aas-core3.0-testgen and back-propagated here.
Just for future reference: it would be easiest to fix the fuzzed examples in aas-core3.0-testgen: https://github.com/aas-core-works/aas-core3.0-testgen/blob/e1087880960dae119fd123af29adc02e7dbf340a/aas_core3_0_testgen/frozen_examples/pattern.py#L356-L366
I'll try to have a look at this soon-ish (next two weeks).
You can fuzz using custom langs (eg x-foobar) and subtags (eg sr-x-foobar-SR-Cyrl). Each custom subtag should be 8 chars max (RFC 5646 allows 1 to 8 alphanumeric characters per private-use subtag).
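For reference, a minimal sketch of such a fuzzer in Python. The function name random_private_use_tag and its parameters are made up for illustration and are not part of aas-core3.0-testgen; the 1 to 8 character bound per subtag is taken from the privateuse production in RFC 5646.

```python
import random
import string

# Private-use subtags may contain ASCII letters and digits only.
ALPHANUM = string.ascii_lowercase + string.digits


def random_private_use_tag(rng, prefix=None, max_subtags=2):
    """Return a tag such as 'x-foobar' or, with prefix='sr', 'sr-x-foobar'.

    Hypothetical helper: every subtag after 'x' is 1 to 8 alphanumeric chars.
    """
    count = rng.randint(1, max_subtags)
    subtags = [
        "".join(rng.choice(ALPHANUM) for _ in range(rng.randint(1, 8)))
        for _ in range(count)
    ]
    parts = ([prefix] if prefix else []) + ["x"] + subtags
    return "-".join(parts)


# A seeded generator keeps the fuzzed examples reproducible.
rng = random.Random(2023)
examples = [random_private_use_tag(rng, prefix="sr") for _ in range(3)]
```

Passing a seeded random.Random instance means the same tags are produced on every run, which is convenient when the examples are frozen into a repository.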
@VladimirAlexiev I fixed the issue in https://github.com/aas-core-works/aas-core3.0-testgen/pull/28. Could you please verify that all the language tags are valid and can be ingested before I copy the examples to this repository?
@mristin
The structure of tags is described in https://www.rfc-editor.org/rfc/rfc5646.html, and the registry is https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry
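As a rough illustration of that structure, here is a sketch of a well-formedness check in Python. It follows the langtag and privateuse productions of RFC 5646 as I read them, but it is an assumption-laden simplification: it ignores grandfathered tags (eg i-klingon), does not reject repeated singletons or variants, and does not look anything up in the IANA registry. The names LANGTAG and is_well_formed are mine.

```python
import re

# Simplified RFC 5646 grammar; grandfathered tags are deliberately left out.
LANGTAG = re.compile(
    r"""
    (?:
        (?:[a-z]{2,3}(?:-[a-z]{3}){0,3} | [a-z]{4} | [a-z]{5,8})  # language, optional extlang
        (?:-[a-z]{4})?                                            # script
        (?:-(?:[a-z]{2}|[0-9]{3}))?                               # region
        (?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*                  # variants
        (?:-[0-9a-wy-z](?:-[a-z0-9]{2,8})+)*                      # extensions
        (?:-x(?:-[a-z0-9]{1,8})+)?                                # private use
      |
        x(?:-[a-z0-9]{1,8})+                                      # private-use-only tag
    )
    """,
    re.VERBOSE | re.IGNORECASE,
)


def is_well_formed(tag):
    """Return True if the whole tag matches the simplified grammar above."""
    return LANGTAG.fullmatch(tag) is not None


candidates = ["en", "en-US", "sr-Cyrl", "zh-cmn", "x-foobar", "en_US", "en-x"]
report = {tag: is_well_formed(tag) for tag in candidates}
```

Well-formed is weaker than valid: a tag can pass this check and still use subtags that are not in the registry.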
I found a couple of doubtful tags. But I think it's ok to leave them in, because they are still valid, or could be valid if the respective singleton chars were registered in the future.
zh-cmn is valid but redundant (i.e. there's a better representation):
Type: redundant
Tag: zh-cmn
Description: Mandarin Chinese
Added: 2005-07-15
Deprecated: 2009-07-29
Preferred-Value: cmn
Singleton chars (except x, i):
"The singleton MUST be one allocated to a registration authority via the mechanism described in Section 3.7"
"Tags that use extensions (examples ONLY -- extensions MUST be defined by revision or update to this document, or by RFC)":
"en-a-myext-b-another" "zh-CN-a-myext-x-private" "en-US-u-islamcal"
@VladimirAlexiev
"I found a couple of doubtful tags."
Can you please quickly test with your tools, and check that they do not complain about those tags? If they do, then I'll remove them to avoid the confusion.
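Independent of any particular tool, the doubtful singleton tags could be flagged with a small sketch like the following. It assumes that only the extension singletons u (RFC 6067) and t (RFC 6497) are currently registered; the names REGISTERED_SINGLETONS and unregistered_singletons are hypothetical.

```python
# Assumption: 't' and 'u' are the only extension singletons registered so far.
REGISTERED_SINGLETONS = {"t", "u"}


def unregistered_singletons(tag):
    """List extension singletons in a tag that are not registered.

    The first subtag is the language, so it is skipped. Everything after
    'x' is private use, so scanning stops there.
    """
    found = []
    for subtag in tag.lower().split("-")[1:]:
        if subtag == "x":
            break
        if len(subtag) == 1 and subtag not in REGISTERED_SINGLETONS:
            found.append(subtag)
    return found


doubtful = {
    tag: unregistered_singletons(tag)
    for tag in ["en-a-myext-b-another", "zh-CN-a-myext-x-private", "en-US-u-islamcal"]
}
```

A tag with a non-empty result is still well-formed; it just relies on an extension that nobody has registered yet.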
They are accepted. Closing
@VladimirAlexiev thanks! I'll update the examples in this repository this Wed.
XML and RDF lang tags are regulated by BCP47 sec 3.1 and recorded in IANA Language Subtag Registry (Google Sheet iana-lang-tags, last updated 12 April 2017, is an easier way to read it). See https://vocab.getty.edu/doc/#Language for more links and examples of custom lang tags.
The examples here use invalid lang tags, eg:
This means that if you try to ingest them in an RDF repo, you will get errors or warnings. I don't know of a "random lang tag" generator, but please use a couple of fixed tags, eg en, en-US, de, de-AT. (If you want more variety, use sr-Cyrl, sr-Latn, with a bow to @mristin :-)