admin-shell-io / aas-specs

Repository of the Asset Administration Shell Specification IDTA-01001 - Metamodel
https://admin-shell-io.github.io/aas-specs-antora/index/home/index.html
Creative Commons Attribution 4.0 International

Relaxation of idShort restrictions (Constraint AASd-002) #295

Open sebastiankb opened 10 months ago

sebastiankb commented 10 months ago

What

idShort is currently defined very restrictively in the specification:

idShort of Referables shall only feature letters, digits, underscore ("_"); starting mandatory with a letter, i.e. [a-zA-Z][a-zA-Z0-9_]*.

idShorts like "min-temperature-value" or "modbus:function" are not allowed.
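To make the restriction concrete, here is a minimal Python sketch that checks a few candidate names against the current constraint. The helper name `is_valid_current` is only illustrative and not part of any AAS library.

```python
import re

# Constraint AASd-002 as currently specified:
# a letter, followed by any number of letters, digits or underscores.
CURRENT_ID_SHORT = re.compile(r"[a-zA-Z][a-zA-Z0-9_]*")


def is_valid_current(id_short: str) -> bool:
    """Return True if the whole string satisfies the current constraint."""
    return CURRENT_ID_SHORT.fullmatch(id_short) is not None


for candidate in [
    "minTemperatureValue",
    "min_temperature_value",
    "min-temperature-value",
    "modbus:function",
]:
    print(f"{candidate!r}: {is_valid_current(candidate)}")
```

Only the first two candidates pass; the hyphenated and the prefixed name are rejected.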

idShorts have a variable-like character. However, I cannot follow why only "_" is allowed as a special character. This complicates the reuse of existing names (which are also variable-like), e.g., from standards that use prefixes.

I tried to understand where the restrictions come from, and I have received two reasonable answers so far:

  1. "should not cause conflicts with the idShort path approach of the REST interface."
  2. "variables in programming languages also do not support special characters"

Mitigation

Point 1 is not really a justification for this restriction: URL paths allow many special characters such as “-”, “$”, “:”, etc.

Point 2 is more justifiable. However, it is hard to generalize, since some programming languages allow “$” in variable names. But serious questions remain: Why should idShort names be reflected identically as variable names in a programming language? Which tool or library does this? AAS comes with its own serialization approaches. It makes more sense to reflect the serialization model programmatically (if needed).

Proposal

It makes sense that idShorts have a variable-like character. However, more flexibility in idShort values would be desirable, e.g., to adhere to existing naming conventions that also use “-” and “:” in names.

Proposal 1: Allow more special characters such as “-” and “:”

and/or

Proposal 2: Allow “%” --> allows URL encoding (all special characters can be reflected)

Note: Proposal 1 and 2 are backward compatible!
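As a rough illustration of Proposal 2, the sketch below percent-encodes names with Python's standard `urllib.parse`. The helper names `encode_id_short` and `decode_id_short` are made up for this example; how an AAS server would actually treat encoded idShorts is not specified here.

```python
from urllib.parse import quote, unquote


def encode_id_short(name: str) -> str:
    """Percent-encode every character that is not an unreserved URL character."""
    # safe="" means that even "/" and ":" are encoded.
    return quote(name, safe="")


def decode_id_short(encoded: str) -> str:
    """Reverse the percent-encoding."""
    return unquote(encoded)


for name in ["modbus:function", "min-temperature-value", "Property{000}"]:
    encoded = encode_id_short(name)
    print(f"{name!r} -> {encoded!r} -> {decode_id_short(encoded)!r}")
```

Hyphens are unreserved characters and stay as they are, while ":" and the braces become "%XX" sequences, so allowing "%" would be enough to carry any character through the encoding.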

BirgitBoss commented 10 months ago

There is a third reason:

  1. "should not cause conflicts with the idShort path approach of the REST interface."
  2. "variables in programming languages also do not support special characters"
  3. the Value-Only approach of the http/REST API is based on the idShort-Names, so only names valid in JSON should be allowed

We additionally do have a display name (in different languages even) for more elaborate names

sebastiankb commented 10 months ago

Mitigation to the third reason: JSON key names are very flexible, e.g. spaces, ":", "-", etc. are also permitted. See also RFC 8259.
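A quick way to check the claim about JSON keys is a round trip through Python's `json` module. The payload below is an invented, Value-Only-style object and is not taken from the AAS API specification.

```python
import json

# Hypothetical Value-Only-like payload whose keys contain "-", ":" and a space.
value_only = {
    "min-temperature-value": 4.5,
    "modbus:function": 3,
    "name with space": True,
}

# Serialize to JSON text and parse it back again.
text = json.dumps(value_only)
round_tripped = json.loads(text)

print(text)
print(round_tripped == value_only)
```

The object survives the round trip unchanged, since a JSON object key is just a string.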

We additionally do have a display name (in different languages even) for more elaborate names

I know, but it's not the same, as display names are mainly used for (human-readable) UI purposes. This is more about keeping the naming convention for idShorts, especially for terms/variables that already exist, e.g. from standards. Many RFC and W3C standards use variable names which include "-" and ":" characters. In terms of interoperability and understandability, it would be nice if established names could be used as is.

BirgitBoss commented 10 months ago

Discussion in Workstream AAS on 2023-11-23

Proposal:

allow "-"

Wish for 4.0: idShort of Referables shall only feature letters, digits, hyphen ("-") and underscore ("_"); starting mandatory with a letter and not ending with a hyphen or underscore, i.e. [a-zA-Z]([a-zA-Z0-9_-]*[a-zA-Z0-9])? .

Decision: Due to backward compatibility we need to allow underscore at the end of the idShort:

idShort of Referables shall only feature letters, digits, hyphen ("-") and underscore ("_"); starting mandatory with a letter and not ending with a hyphen, i.e. [a-zA-Z]([a-zA-Z0-9_-]*[a-zA-Z0-9_])? .

To be checked: is the regular expression with "-" correct like this?
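One way to check the two expressions is to run them against a few edge cases. The sketch below assumes that both character classes contain the underscore, as the wording of the constraint suggests; the constant names are illustrative only.

```python
import re

# "Wish for 4.0": must not end with a hyphen or an underscore.
WISH_4_0 = re.compile(r"[a-zA-Z]([a-zA-Z0-9_-]*[a-zA-Z0-9])?")

# "Decision": a trailing underscore stays allowed for backward compatibility.
DECISION = re.compile(r"[a-zA-Z]([a-zA-Z0-9_-]*[a-zA-Z0-9_])?")


def matches(pattern: re.Pattern, id_short: str) -> bool:
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(id_short) is not None


for candidate in ["a", "a_", "a-", "min-temperature-value", "modbus:function"]:
    print(
        f"{candidate!r}: wish={matches(WISH_4_0, candidate)}, "
        f"decision={matches(DECISION, candidate)}"
    )
```

The two variants differ only for names ending in an underscore, such as "a_", which the decision keeps valid.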

sebastiankb commented 10 months ago

this should be the right regular expression:

^[a-zA-Z][a-zA-Z0-9_-]*[a-zA-Z0-9_]+$

BirgitBoss commented 7 months ago

this should be the right regular expression:

^[a-zA-Z][a-zA-Z0-9_-]*[a-zA-Z0-9_]+$

I think the regular expression is not backward compatible because it requires at least two characters (we relaxed this constraint with V3.0RC02). This should be correct:

^[a-zA-Z] ([a-zA-Z0-9_-]*[a-zA-Z0-9_]+ | [a-zA-Z0-9_]* ) $

BirgitBoss commented 7 months ago

Another issue: in Annex C (Backus-Naur Form) we do not explain the characters ^ and $. What do they mean? ^ means "beginning of line"; $ means "end of line".
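The effect of the two anchors can be shown with a small Python sketch. It uses the existing AASd-002 pattern and describes Python's `re` semantics only; other regex dialects (for example the one used in XML Schema) may treat anchoring differently.

```python
import re

UNANCHORED = r"[a-zA-Z][a-zA-Z0-9_]*"
ANCHORED = r"^[a-zA-Z][a-zA-Z0-9_]*$"

bad_name = "9bad-name!"

# Without anchors, search() is satisfied by any matching substring ("bad").
print(re.search(UNANCHORED, bad_name))

# With ^ and $, the match has to cover the input from start to end.
print(re.search(ANCHORED, bad_name))

# Python detail: "$" also matches just before a trailing newline,
# so fullmatch() is the stricter check for a complete string.
print(re.search(ANCHORED, "goodName\n") is not None)
print(re.fullmatch(UNANCHORED, "goodName\n") is not None)
```

In other words, the anchors turn "contains something that looks like an idShort" into "is an idShort".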

BirgitBoss commented 7 months ago

see decision https://github.com/admin-shell-io/aas-specs/issues/295#issuecomment-1824136575

#Constraint AASd-002:# _idShort_ of __Referable__s shall only feature letters, digits, hyphen ("-") and underscore ("_"); starting mandatory with a letter, and not ending with a hyphen, i.e. ^[a-zA-Z] ([a-zA-Z0-9_-]*[a-zA-Z0-9_]+ | [a-zA-Z0-9_]* ) $.

For SMT submodel elements (see https://industrialdigitaltwin.org/en/content-hub/create-a-submodel) also other special characters like "{000}" are used.

Alternatives:

  a) make three constraints: one for submodel elements within a Submodel instance, one for submodel elements within a Submodel template, and one for elements that are not submodel elements but are referable
  b) only one constraint for submodel instances (the existing AASd-002). This means everything is allowed in SMT
  c) extend the existing constraint AASd-002

For a) and c) the question is how strict to make it, e.g. just: #Constraint AASd-00x:# _idShort_ of __SubmodelElement__s within a Submodel template (Submodel/kind = Template) shall only feature letters, digits, hyphen ("-") and underscore ("_"); starting mandatory with a letter, and not ending with a hyphen. Additionally, the wildcards {00} or {000} may be used, i.e. ^[a-zA-Z] ([a-zA-Z0-9_-]*[a-zA-Z0-9_]+ | [a-zA-Z0-9_]* ) < { 0[0]+ }$.
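A possible reading of the template variant is sketched below in Python. It is an assumption, not the agreed constraint: the wildcard is taken to be an optional suffix of two or more zeros in braces, appended to a name that otherwise follows the relaxed AASd-002 rule. `TEMPLATE_ID_SHORT` is a made-up name.

```python
import re

# Assumed shape: relaxed idShort, optionally followed by a {00...} wildcard.
TEMPLATE_ID_SHORT = re.compile(
    r"[a-zA-Z]([a-zA-Z0-9_-]*[a-zA-Z0-9_])?"  # relaxed AASd-002 part
    r"(\{0{2,}\})?"                            # optional wildcard such as {00} or {000}
)


def is_valid_template_id_short(id_short: str) -> bool:
    """Return True if the whole string fits the assumed template rule."""
    return TEMPLATE_ID_SHORT.fullmatch(id_short) is not None


for candidate in ["Document{00}", "Document{000}", "Document", "Document{0}", "{00}"]:
    print(f"{candidate!r}: {is_valid_template_id_short(candidate)}")
```

Whether the wildcard may only appear at the end, and how many zeros are permitted, would still have to be decided.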

sebastiankb commented 7 months ago

this should be the right regular expression: ^[a-zA-Z][a-zA-Z0-9_-]*[a-zA-Z0-9_]+$

I think the regular expression is not backward compatible because it requires at least two characters (we relaxed this constraint with V3.0RC02). This should be correct:

^[a-zA-Z] ([a-zA-Z0-9_-]*[a-zA-Z0-9_]+ | [a-zA-Z0-9_]* ) $

The pattern I have proposed takes this restriction into account with at least two characters. You can test this with this tool: https://regex101.com/

One character --> no match (screenshot omitted)

Two characters --> match (screenshot omitted)
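The same check can be reproduced without the online tool. This is a small Python sketch of the test described above, using the pattern exactly as posted; `TWO_CHAR_PATTERN` is just a local name.

```python
import re

# Pattern as proposed: letter, optional middle part, then at least one more character.
TWO_CHAR_PATTERN = re.compile(r"^[a-zA-Z][a-zA-Z0-9_-]*[a-zA-Z0-9_]+$")

one_char = TWO_CHAR_PATTERN.search("a")
two_chars = TWO_CHAR_PATTERN.search("ab")

print("one character matches:", one_char is not None)
print("two characters match:", two_chars is not None)
```

The trailing `+` requires at least one character after the leading letter, which is why a single letter cannot match.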

sebastiankb commented 7 months ago

For SMT submodel elements (see https://industrialdigitaltwin.org/en/content-hub/create-a-submodel) also other special characters like "{000}" are used.

Alternatives:

  a) make three constraints: one for submodel elements within a Submodel instance, one for submodel elements within a Submodel template, and one for elements that are not submodel elements but are referable
  b) only one constraint for submodel instances (the existing AASd-002). This means everything is allowed in SMT
  c) extend the existing constraint AASd-002

A good catch. I would prefer to make an exception for templates. On the other hand, it becomes inconsistent if we have different variants of constraints depending on the AAS form (template/instance/type). We should discuss this in the next meeting.

BirgitBoss commented 7 months ago

into account with at least two characters

This is exactly what is NOT valid: 1-letter idShorts are valid as well.

sebastiankb commented 7 months ago

Sorry, I misunderstood. However, your proposed regular expression above does not allow 1-letter idShorts either.

This version should work:

^[a-zA-Z]([a-zA-Z0-9_-]*[a-zA-Z0-9_]+)?$
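The following Python sketch checks this version by brute force over all short strings from a small test alphabet. It also tries one possible explanation for the earlier disagreement: if the spaces in the spaced-out variant are taken literally, a single letter cannot match, whereas in verbose mode (spaces ignored) it does. The spaced pattern is written here with underscores in both classes, which is an assumption; all names are illustrative.

```python
import re
import string
from itertools import product

OLD = re.compile(r"[a-zA-Z][a-zA-Z0-9_]*")                   # current AASd-002
FINAL = re.compile(r"^[a-zA-Z]([a-zA-Z0-9_-]*[a-zA-Z0-9_]+)?$")
SPACED = r"^[a-zA-Z] ([a-zA-Z0-9_-]*[a-zA-Z0-9_]+ | [a-zA-Z0-9_]* ) $"

ALLOWED = set(string.ascii_letters + string.digits + "_-")
TEST_ALPHABET = "aZ0_-:"  # representatives of each character category


def all_strings(max_len: int):
    """Yield every string of length 1..max_len over the test alphabet."""
    for length in range(1, max_len + 1):
        for chars in product(TEST_ALPHABET, repeat=length):
            yield "".join(chars)


def plain_rule(s: str) -> bool:
    """The constraint in words: starts with a letter, allowed chars only, no trailing hyphen."""
    return s[0] in string.ascii_letters and set(s) <= ALLOWED and not s.endswith("-")


samples = list(all_strings(4))

# Everything valid under the old constraint must stay valid.
backward_compatible = all(FINAL.fullmatch(s) for s in samples if OLD.fullmatch(s))

# The final pattern should agree with the textual rule on every sample.
agrees_with_text = all((FINAL.fullmatch(s) is not None) == plain_rule(s) for s in samples)

# Spaced variant: literal spaces versus verbose mode.
literal_one_letter = re.search(SPACED, "a") is not None
verbose = re.compile(SPACED, re.VERBOSE)
verbose_equals_final = all(
    (verbose.search(s) is not None) == (FINAL.fullmatch(s) is not None) for s in samples
)

print(backward_compatible, agrees_with_text, literal_one_letter, verbose_equals_final)
```

On this sample set the final pattern is backward compatible and matches the wording of the constraint, and the spaced variant is equivalent to it once whitespace is ignored.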