Open relu91 opened 2 years ago
I think that giving a clear algorithm to map an XML document to a JSON document is better than the current specification, maybe we can even find something already defined. Opinions?
The current algorithm is essentially inspired by EXI for JSON.
It maps JSON to XML.
What you are describing is mapping XML to JSON. I think this is not possible in a generic way since XML is more powerful, e.g., XML has the power to
- have attributes and character data for a simple type element
<ContactId hasFoo="true">2147483647</ContactId>
- well-defined sequences of elements ...
- use xsi:type casting
- use nillable in an instance to indicate that there is no content
- ...
Hence, I don't think this will ever work consistently.
With respect to describing an XML-to-JSON conversion, I am not sure how best to move on from here.
In my own defense I have just copied from @takuki's initial contribution and worked with XML only in configuration files and not in APIs, so I do not have a good opinion.
@danielpeintner is it possible to say that we can always define a loose schema which would validate more XML payloads than intended?
By the way, I had asked the JSON Schema community about documented support for such cases and the answer is no. There is a possible clean way with https://json-schema.org/draft/2020-12/json-schema-validation.html#rfc.section.8.5 where we can clearly say that a schema is only for this xml payload.
I think that giving a clear algorithm to map an XML document to a JSON document is better than the current specification, maybe we can even find something already defined. Opinions?
The current algorithm is essentially inspired by EXI for JSON.
It maps JSON to XML.
What you are describing is mapping XML to JSON. I think this is not possible in a generic way since XML is more powerful, e.g., XML has the power to
- have attributes and character data for a simple type element
<ContactId hasFoo="true">2147483647</ContactId>
- well-defined sequences of elements ...
- use xsi:type casting
- use nillable in an instance to indicate that there is no content
- ...
Hence, I don't think this will ever work consistently.
With respect to describing an XML-to-JSON conversion, I am not sure how best to move on from here.
I see, what if we describe a default algorithm that converts attributes/well-defined seq/xsi:type/nillable etc. and then configure it using XML protocol binding vocabulary terms? For example:
{
  "hasFoo": true,
  "node_value": 2146 // real value
}
xml:no-attributes: true
or at servient level. 🤔 Just some additional options....
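A minimal sketch of what such a default algorithm could look like, assuming Python's standard `xml.etree.ElementTree`. The function names `coerce` and `element_to_json` and the `no_attributes` flag (standing in for the proposed `xml:no-attributes` term) are hypothetical; only the `node_value` key comes from the example above. It is not a proposal for normative text.

```python
import xml.etree.ElementTree as ET

def coerce(text):
    """Best-effort scalar coercion (assumption: a real mapping would
    take the target type from the data schema instead of guessing)."""
    if text in ("true", "false"):
        return text == "true"
    try:
        return int(text)
    except ValueError:
        return text

def element_to_json(elem, no_attributes=False):
    """Map an element to a JSON-like Python value.

    Attributes become object members and character data goes under
    "node_value". With no_attributes=True, attributes are dropped.
    Repeated child names overwrite each other in this sketch.
    """
    text = (elem.text or "").strip()
    children = list(elem)
    # A leaf without (kept) attributes collapses to a plain scalar.
    if not children and (no_attributes or not elem.attrib):
        return coerce(text)
    result = {}
    if not no_attributes:
        for name, value in elem.attrib.items():
            result[name] = coerce(value)
    for child in children:
        result[child.tag] = element_to_json(child, no_attributes)
    if text:
        result["node_value"] = coerce(text)
    return result

root = ET.fromstring('<ContactId hasFoo="true">2147483647</ContactId>')
print(element_to_json(root))                      # {'hasFoo': True, 'node_value': 2147483647}
print(element_to_json(root, no_attributes=True))  # 2147483647
```

The flag changes the shape of the result (object versus scalar), which means the data schema would have to be written with the chosen configuration in mind.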
By the way, I had asked the JSON Schema community about documented support for such cases and the answer is no. There is a possible clean way with https://json-schema.org/draft/2020-12/json-schema-validation.html#rfc.section.8.5 where we can clearly say that a schema is only for this xml payload.
yeah but this is really almost like no validation at all... it just says put everything inside a string and treat it as XML.
Found also this document: https://www.w3.org/2011/10/integration-workshop/s/ExperienceswithJSONandXMLTransformations.v08.pdf
It defines "friendly xml". maybe we can have the rule to map only "friendly XML" and fall back to string if it is unfriendly. just throwing ideas on the table...
Many people nowadays run into the need to support JSON and XML at the same time. Even XQuery, which used to be a query language for XML only, has support for JSON nowadays.
Having said that, I don't think it is possible to properly support XML validation based on JSON Schema. Let me throw in another proposal, since I believe proper XML validation needs XML Schema: what if we use real XML Schema?
XSD example for a person
<xs:element name="person">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
      <xs:any minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
Would it be feasible to represent the XSD in JSON schema... such as
{
  "title": "Person",
  "type": "string",
  "format": "xml",
  "format-constraints": "<xs:element name=\"person\"> ..."
}
The keys format and format-constraints are just there to highlight what I mean. We can use whatever terms we want. We should just make sure that JSON validators can handle it properly. In the worst case a validator should accept any string (or, respectively, any XML).
Other validators that support proper XML validation can use the information given in format-constraints to validate.
Moreover, this allows to support other formats in the future also with the same principle.
What do you think?
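To make the fallback behaviour concrete, here is a rough sketch assuming Python's standard `xml.etree.ElementTree`. The function name `validate_xml_string` is hypothetical, and `format` / `format-constraints` are the placeholder keys from the example above, not existing JSON Schema keywords. This version only checks well-formedness and ignores the embedded XSD.

```python
import xml.etree.ElementTree as ET

def validate_xml_string(schema, value):
    """Fallback check: accept any well-formed XML string."""
    if not isinstance(value, str):
        return False
    if schema.get("format") == "xml":
        try:
            ET.fromstring(value)  # well-formedness only, no XSD validation
        except ET.ParseError:
            return False
    # "format-constraints" is deliberately ignored in this sketch; a
    # validator with XSD support could compile it and validate the document.
    return True

person_schema = {
    "title": "Person",
    "type": "string",
    "format": "xml",
    "format-constraints": '<xs:element name="person"> ...',
}

ok = "<person><firstname>A</firstname><lastname>B</lastname></person>"
print(validate_xml_string(person_schema, ok))          # True
print(validate_xml_string(person_schema, "<person>"))  # False
```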
I think it is an option. One downside that I see is that from the application point of view we are just talking about strings. It would be better to still maintain the ability to properly describe a formal data type. Basically, it works well for validating the payload but not to "transform"/"map"/"convert" to a consistent DataSchema value.
From the call of 15.12:
contentMediaSchema in the forms and extend the affordance level schema with an XML schema like @danielpeintner's example above.
Decision: @danielpeintner will provide XML and XML Schema examples where these corner cases exist. These edge cases should not require the expansion of the data schema. We will document how implementors should handle these cases but we need some examples and further iteration on this.
Let me give you some examples where I think it is difficult to represent XML as JSON.
Note1: I am not saying these are good examples. I just want to show that I think XML is more expressive. Note2: I tried to use https://www.freeformatter.com/xml-to-json-converter.html to experiment a bit and some results are a bit surprising to me.
In the end I don't know whether the examples I give below are good practice... I don't think so.. in most of the cases.
XML allows type-casting an element to a subtype. For example, an xsd:decimal can be typed as xsd:integer or xsd:unsignedInt in the instance document.
Type casts are also possible with complexTypes.
Moreover, any element can be marked as nil in an XML instance (xsi:nil="true") if the schema contains nillable="true" for this element.
XSD allows an element with a simple value, such as simpleValue typed as an integer, to still carry an attribute. This would need to be mapped to an object in JSON Schema.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="simpleValue">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:integer">
          <xs:attribute name="id" use="required" type="xs:string"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
Instance document:
<simpleValue id="dd">12</simpleValue>
The online tool converts it to
{
"@id": "dd",
"#text": "12"
}
Which is surprising to me.
value is once an attribute and the other time an element name, and the two can have different types.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="conflictingNames">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="value" type="xs:integer" />
      </xs:sequence>
      <xs:attribute name="value" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
Instance document:
<conflictingNames value="XX">
<value>12</value>
</conflictingNames>
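One way to keep the two `value` entries apart is to give attributes a prefix, similar to the `@id` key the online tool produced earlier. The sketch below assumes Python's standard `xml.etree.ElementTree`. The helper name `to_json_with_prefix` is hypothetical, and it handles only one level of children and leaves all values as strings.

```python
import xml.etree.ElementTree as ET

def to_json_with_prefix(elem, attr_prefix="@"):
    """Map attributes to prefixed keys so they cannot collide with
    child element names (one level deep, values kept as strings)."""
    result = {attr_prefix + name: value for name, value in elem.attrib.items()}
    for child in elem:
        result[child.tag] = (child.text or "").strip()
    return result

doc = ET.fromstring(
    '<conflictingNames value="XX"><value>12</value></conflictingNames>'
)
print(to_json_with_prefix(doc))  # {'@value': 'XX', 'value': '12'}
```

A data schema describing this payload would then need a property literally named `@value`, which is something the binding would have to prescribe.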
In complex types that use xsd:sequence, the order of elements matters by default. In JSON this is not the case. Anyhow, I don't think this is a big deal. What I am not sure about is the case where the same element name is declared twice. Not sure if this is actually an issue.
<xs:element name="orderWithDifferentElements">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="value" type="xs:integer" />
      <xs:element name="value" type="xs:integer" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
XSD allows specifying that any element or anyAttribute may appear. I think this is covered by JSON Schema and "additionalProperties": true.
What is not covered is the feature that anyX can be limited to a given namespace.
I had to learn a bit of XML Schema to follow, so I am not sure if my answers are really correct.
Nillable: This is possible with JSON Schema if I use anyOf and give type: null for one of the options.
My assessment of the situation: In X percent of the cases we can have a JSON Schema representation. To me that X feels like 80%, but that is my opinion, without much experience with XML-based APIs.
Nillable: This is possible with JSON Schema if I say anyOf and give type:null for one of the options
In XSD you can use that for any type. With JSON Schema you would need to wrap any type in anyOf, I think.
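The wrapping mentioned here can be sketched in a few lines of Python operating on schemas as plain dictionaries. The helper names `nillable` and `make_all_nillable` are hypothetical; only the `anyOf` plus `type: null` pattern comes from the discussion.

```python
def nillable(schema):
    """Wrap a schema so that null is accepted as well."""
    return {"anyOf": [schema, {"type": "null"}]}

def make_all_nillable(object_schema):
    """Apply the wrapper to every property of an object schema,
    mimicking nillable="true" on each element declaration."""
    props = {
        name: nillable(sub)
        for name, sub in object_schema.get("properties", {}).items()
    }
    return {**object_schema, "properties": props}

person = {
    "type": "object",
    "properties": {"firstname": {"type": "string"}, "age": {"type": "integer"}},
}
print(make_all_nillable(person)["properties"]["age"])
# {'anyOf': [{'type': 'integer'}, {'type': 'null'}]}
```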
Typecasting for simple types: First of all, there are less baked-in types but an integer validates a number schema. If you mean that a number like 12.0 can be casted into 12 during validation, I am not sure if it exists by the standard itself.
XSD has a type hierarchy (see https://www.w3.org/TR/xmlschema-2/#built-in-datatypes), I think JSON schema may have one for integer & number but not the rest.
Typecasting for complex types: Not sure what the example can be here
An example can be any hierarchy again. One can say "I expect a Person" and at runtime say it is a Student.
Simple value with attribute: This was not part of some of the beginner tutorials but of course I see the uses. I am also surprised by the conversion but I think that this can have different results in JSON Schema since attribute is simply not supported in JSON. However, the same knowledge can be represented and we might have to prescribe how it should be done.
Yes, probably.
Conflicting names: Maybe depending on the result of solving the attribute description, this can be fine.
Mhh, since JSON has no attributes this is tricky ... maybe
Sequence of elements: I think this is a big deal since nothing in JSON Schema can constrain the order of object keys since there is no need to do so in JSON. A custom keyword with a JSON Schema vocabulary would be possible...
Agree
Sequence of elements, part 2: Well this is annoying since the key's value can have a different type based on its location. Even if we add a custom keyword to ensure order in JSON Schema objects, we cannot have two keys with the same name (or if we do, most JSON parsers take the second one)
I know.
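To illustrate the problem and one possible way around it, the sketch below (Python standard library only) keeps child elements as an ordered list of name/value pairs instead of an object. This representation is an assumption for illustration, not something proposed in this thread, and the helper name `children_as_ordered_pairs` is hypothetical.

```python
import xml.etree.ElementTree as ET

def children_as_ordered_pairs(elem):
    """Keep document order and duplicate names by emitting
    [name, text] pairs instead of object members."""
    return [[child.tag, (child.text or "").strip()] for child in elem]

doc = ET.fromstring(
    "<orderWithDifferentElements>"
    "<value>1</value><value>2</value>"
    "</orderWithDifferentElements>"
)
pairs = children_as_ordered_pairs(doc)
print(pairs)        # [['value', '1'], ['value', '2']]
print(dict(pairs))  # {'value': '2'}, a plain object keeps only the last entry
```

The array form could be described with an ordinary array schema, at the cost of a less natural payload for applications.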
any and anyAttribute: I think I did not understand this, can you elaborate?
Essentially in XSD one can say I expect some attribute (or element). As said similar to "additionalProperties" in JSON Schema. The difference is that I can say the any MUST be of a certain namespace... e.g., the attribute must be in the context "https://w3id.org/saref#" only. Having said that, other attributes are not allowed.
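As a rough illustration of that namespace restriction, the following sketch uses Python's standard `xml.etree.ElementTree`, which exposes qualified attribute names in `{namespace}local` form. The element and attribute names (`device`, `hasState`, `color`) and the helper `attributes_outside_namespace` are made up for the example; only the SAREF namespace URI is taken from the comment above.

```python
import xml.etree.ElementTree as ET

SAREF = "https://w3id.org/saref#"

def attributes_outside_namespace(elem, namespace, declared=()):
    """Return wildcard attributes that are not in `namespace`.

    `declared` lists attributes that the schema declares explicitly
    and that are therefore not subject to the wildcard check.
    """
    prefix = "{" + namespace + "}"
    return [
        name for name in elem.attrib
        if name not in declared and not name.startswith(prefix)
    ]

doc = ET.fromstring(
    '<device xmlns:saref="https://w3id.org/saref#" '
    'xmlns:other="http://example.org/other" '
    'id="d1" saref:hasState="on" other:color="red"/>'
)
print(attributes_outside_namespace(doc, SAREF, declared=("id",)))
# ['{http://example.org/other}color']
```

JSON Schema's `additionalProperties` has no notion of namespaces, so a check like this would have to live outside the schema or in a custom keyword.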
My assessment of the situation: In X percent of the cases we can have a JSON Schema representation. To me that X feels like 80% but then it is my opinion with not much experience with XML-based APIs.
I think the percentage is even higher.. but not sure either.
Call from 02.11:
Some initial findings / thoughts in a Gist
Personally I still see XML used in APIs nowadays (even though I don't have a good use-case to share). Anyhow I must admit that in most of the cases JSON is used.
Does this mean allowing to describe XML payloads is no longer a valid use case? I don't think so...
FYI: RAML (the API modelling language) allows for both formats, JSON and XML.
Given that we have data mapping in our charter, I am adding selected label.
Reviewing the XML Binding template document, I noticed that it tries to describe a way to serialize JSON data to XML rather than to describe existing XML documents with our data schema. Given that I can't really find any modern API that returns XML data, I think we should be more open and not force a particular "serialization" pattern even for XML.
Taking for example this XML payload:
I think we can correctly map to this JSON object:
Consequently, knowing this mapping, we can define the correct data schema:
I think that giving a clear algorithm to map an XML document to a JSON document is better than the current specification, maybe we can even find something already defined. Opinions?