eclipse-esmf / esmf-semantic-aspect-meta-model

Formal and textual specification of the Semantic Aspect Meta Model (SAMM)
https://eclipse-esmf.github.io/samm-specification/snapshot/index.html
Mozilla Public License 2.0

[Task] XML string ⟷ JSON string #177

Open mristin opened 1 year ago

mristin commented 1 year ago

Is your task related to a problem? Please describe.

JSON strings can represent arbitrary Unicode strings.

In contrast, XML strings are constrained to the ranges [\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF] (according to the de facto standard XML 1.0).

This means that you cannot encode characters such as \x00 or \x01 in XML, even as escaped character references. For example, this is invalid XML:

<something>&#x00;</something>

For more information, start at https://www.w3.org/TR/xmlschema-2/#string and then follow its reference to https://www.w3.org/TR/2000/WD-xml-2e-20000814#dt-character.
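To make the mismatch concrete, here is a minimal sketch using only the Python standard library. It assumes the bundled parser behind `xml.etree.ElementTree` (expat, which implements XML 1.0); the helper name `parses_as_xml` is made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

# JSON accepts an escaped NUL inside a string and decodes it to U+0000.
decoded = json.loads('"\\u0000"')


def parses_as_xml(text: str) -> bool:
    """Return True if the standard-library XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Character references to U+0000 and U+0001 are rejected by an XML 1.0 parser,
# while a reference to TAB (U+0009) is accepted.
nul_ok = parses_as_xml("<something>&#x00;</something>")
soh_ok = parses_as_xml("<something>&#x01;</something>")
tab_ok = parses_as_xml("<something>&#x09;</something>")

print(repr(decoded), nul_ok, soh_ok, tab_ok)
```

So the same character that round-trips through JSON without complaint makes the XML document not well-formed, whether it is written literally or as a character reference.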

Additional context

@atextor You are probably aware of the issue. I don't know if it directly impacts your design. I just wanted to give you a heads-up in case it was not on your radar.

This is related to #175.

atextor commented 1 year ago

You are of course correct in that JSON strings have a different value range than XSD strings. However, this is not a problem because XSD string's value range is a subset of JSON's.

The Payloads section states the following:

[...] the Property is serialized as ${propertyName}: ${value} where ${value} is the JSON serialization of the respective Property’s value, details on mapping of the data types are given in Data type mappings.

In other words, if a bamm:Property has an effective data type of xsd:string, then the JSON payload data for the corresponding aspect may only contain values that are also valid xsd:string; you're not allowed to put 0x00 in the JSON payload.
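As a rough sketch of what such a check could look like on the consumer side (not something the specification or the SDKs prescribe): the snippet below walks a decoded JSON payload and reports the location of every string value that falls outside the XML 1.0 range quoted earlier in this thread. It assumes, for simplicity, that every string in the payload belongs to a Property with effective data type xsd:string; the names `XML_1_0_CHARS` and `invalid_strings` are hypothetical.

```python
import json
import re
from typing import Any, Iterator

# Character range from the issue description (XML 1.0), applied to the
# whole string via fullmatch().
XML_1_0_CHARS = re.compile(
    r"[\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]*"
)


def invalid_strings(value: Any, path: str = "$") -> Iterator[str]:
    """Yield a path for each string value that is not a valid XML 1.0 string.

    Assumption: all string values are treated as xsd:string; a real check
    would consult the Aspect Model for each Property's data type.
    """
    if isinstance(value, str):
        if XML_1_0_CHARS.fullmatch(value) is None:
            yield path
    elif isinstance(value, dict):
        for key, item in value.items():
            yield from invalid_strings(item, f"{path}.{key}")
    elif isinstance(value, list):
        for index, item in enumerate(value):
            yield from invalid_strings(item, f"{path}[{index}]")


payload = json.loads('{"name": "ok", "note": "bad\\u0000", "tags": ["a", "\\u0001"]}')
violations = list(invalid_strings(payload))
print(violations)
```

Object keys are not checked here; only values are, since the quoted Payloads text concerns `${value}`.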

I'll add the "to be discussed" label, because it might be necessary (or at least helpful) to explain this more explicitly.

mristin commented 1 year ago

@atextor also mind the SDKs.

mristin commented 1 year ago

@atextor sorry, I also forgot to mention an important detail. The allowed ranges differ between XML 1.0 and 1.1. The above ranges correspond to XML 1.0. XML 1.1 is less restrictive:

^[\u0001-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]*$

You probably want to specify and check the allowed ranges explicitly, and not rely on the user's assumption of the XML version.
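One way to make the choice explicit is to require the caller to name the XML version instead of assuming one. The following is a minimal sketch under that assumption; the function name `is_valid_xml_string` and the table `_ALLOWED` are made up for illustration, and the patterns are the two ranges quoted in this thread. It only models which characters are allowed at all; as far as I know, XML 1.1 additionally requires most control characters to be written as character references, which is not modeled here.

```python
import re

# Allowed character ranges per XML version, as quoted in this thread.
_ALLOWED = {
    "1.0": re.compile(
        r"[\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]*"
    ),
    "1.1": re.compile(
        r"[\u0001-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]*"
    ),
}


def is_valid_xml_string(text: str, version: str) -> bool:
    """Check text against the character range of the given XML version.

    The version has no default on purpose, so callers cannot silently
    rely on an assumed XML version.
    """
    try:
        pattern = _ALLOWED[version]
    except KeyError:
        raise ValueError(f"unsupported XML version: {version!r}") from None
    return pattern.fullmatch(text) is not None


# U+0001 is where the two versions disagree; U+0000 is rejected by both.
print(is_valid_xml_string("\x01", "1.0"), is_valid_xml_string("\x01", "1.1"))
print(is_valid_xml_string("\x00", "1.0"), is_valid_xml_string("\x00", "1.1"))
```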