w3c / wot-binding-templates

Web of Things (WoT) Binding Templates
http://w3c.github.io/wot-binding-templates/

XML template is more for green field devices #139

Open relu91 opened 2 years ago

relu91 commented 2 years ago

While reviewing the XML Binding Template document, I noticed that it tries to describe a way to serialize JSON data to XML rather than to describe existing XML documents with our data schema. Given that I can't really find any modern API that returns XML data, I think we should be more open and not force a particular "serialization" pattern, even for XML.

Taking for example this XML payload:

<SaveContactResponse xmlns="http://schemas.datacontract.org/2004/07/SmashFly.WebServices.ContactManagerService.v2">
    <ContactId>2147483647</ContactId>
    <Errors>
        <string xmlns="http://schemas.microsoft.com/2003/10/Serialization/Arrays">Error 1</string>
        <string xmlns="http://schemas.microsoft.com/2003/10/Serialization/Arrays">Error 2</string>
    </Errors>
    <HasErrors>true</HasErrors>
</SaveContactResponse>

I think we can correctly map to this JSON object:

{
  "SaveContactResponse": {
    "ContactId": 2147483647,
    "Errors": [ { "string": "Error 1" }, { "string": "Error 2" } ],
    "HasErrors": true
  }
}
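
A minimal Python sketch of such a mapping, under assumptions that are mine and not part of any specification: namespaces are dropped, attributes are ignored, leaf text is coerced to an integer or boolean when it looks like one, and repeated child names become an array of single-key objects. The helper names (`local_name`, `coerce`, `to_json_value`, `xml_to_json`) are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

XML = """<SaveContactResponse xmlns="http://schemas.datacontract.org/2004/07/SmashFly.WebServices.ContactManagerService.v2">
    <ContactId>2147483647</ContactId>
    <Errors>
        <string xmlns="http://schemas.microsoft.com/2003/10/Serialization/Arrays">Error 1</string>
        <string xmlns="http://schemas.microsoft.com/2003/10/Serialization/Arrays">Error 2</string>
    </Errors>
    <HasErrors>true</HasErrors>
</SaveContactResponse>"""


def local_name(tag):
    # ElementTree encodes namespaced tags as "{namespace}name"; keep only the name.
    return tag.rsplit("}", 1)[-1]


def coerce(text):
    # Guess a JSON type from leaf text (an assumption of this sketch).
    text = (text or "").strip()
    if text in ("true", "false"):
        return text == "true"
    try:
        return int(text)
    except ValueError:
        return text


def to_json_value(element):
    children = list(element)
    if not children:
        return coerce(element.text)
    names = [local_name(child.tag) for child in children]
    if len(set(names)) < len(names):
        # Repeated child names: emit an array of single-key objects.
        return [{name: to_json_value(child)} for name, child in zip(names, children)]
    return {name: to_json_value(child) for name, child in zip(names, children)}


def xml_to_json(xml_text):
    root = ET.fromstring(xml_text)
    return {local_name(root.tag): to_json_value(root)}


result = xml_to_json(XML)
print(json.dumps(result, indent=2))
```

Note that with a single `<string>` child, this sketch would produce an object instead of a one-element array, so the shape of `Errors` cannot be decided from the instance alone.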

Consequently, knowing this mapping, we can define the correct data schema:

{
  "type": "object",
  "properties": {
    "SaveContactResponse": { "type": "object" /* etc. */ }
  }
}

I think that giving a clear algorithm to map an XML document to a JSON document is better than the current specification, maybe we can even find something already defined. Opinions?

danielpeintner commented 2 years ago

I think that giving a clear algorithm to map an XML document to a JSON document is better than the current specification, maybe we can even find something already defined. Opinions?

The current algorithm is essentially inspired by EXI for JSON.

It maps JSON to XML.

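What you are describing is mapping XML to JSON. I think this is not possible in a generic way since XML is more powerful, e.g., XML has the power to

  • have attributes and character data on a simple-type element, e.g. <ContactId hasFoo="true">2147483647</ContactId>
  • have well-defined sequences of elements ...
  • use xsi:type casting
  • use nillable in an instance to indicate that there is no content
  • ...

Hence, I don't think this will ever work consistently.

With respect to describing an XML-to-JSON conversion, I am not sure how best to move on from here.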

egekorkan commented 2 years ago

In my own defense I have just copied from @takuki's initial contribution and worked with XML only in configuration files and not in APIs, so I do not have a good opinion.

@danielpeintner, is it possible to say that we can always define a loose schema, which would validate more XML payloads than intended?

By the way, I had asked the JSON Schema community about documented support for such cases and the answer is no. There is a possible clean way with https://json-schema.org/draft/2020-12/json-schema-validation.html#rfc.section.8.5 where we can clearly say that a schema is only for this xml payload.

relu91 commented 2 years ago

I think that giving a clear algorithm to map an XML document to a JSON document is better than the current specification, maybe we can even find something already defined. Opinions?

The current algorithm is essentially inspired by EXI for JSON.

It maps JSON to XML.

What you are describing is mapping XML to JSON. I think this is not possible in a generic way since XML is more powerful, e.g., XML has the power to

  • have attributes and character data on a simple-type element, e.g. <ContactId hasFoo="true">2147483647</ContactId>
  • have well-defined sequences of elements ...
  • use xsi:type casting
  • use nillable in an instance to indicate that there is no content
  • ...

Hence, I don't think this will ever work consistently.

With respect to describing an XML-to-JSON conversion, I am not sure how best to move on from here.

I see, what if we describe a default algorithm that converts attributes/well-defined seq/xsi:type/nillable etc. and then configure it using XML protocol binding vocabulary terms? For example:

Just some additional options....

By the way, I had asked the JSON Schema community about documented support for such cases and the answer is no. There is a possible clean way with https://json-schema.org/draft/2020-12/json-schema-validation.html#rfc.section.8.5 where we can clearly say that a schema is only for this xml payload.

Yeah, but this is really almost like no validation at all. It just says: put everything inside a string and treat it as XML.

relu91 commented 2 years ago

Found also this document: https://www.w3.org/2011/10/integration-workshop/s/ExperienceswithJSONandXMLTransformations.v08.pdf

It defines "friendly XML". Maybe we can have a rule to map only "friendly XML" and fall back to a string if the XML is unfriendly. Just throwing ideas on the table...

danielpeintner commented 2 years ago

Many people nowadays run into the need to support JSON and XML at the same time. Even XQuery, which used to be a query language for XML only, has support for JSON nowadays.

Having said that, I don't think it is possible to properly support XML validation based on JSON Schema. Let me throw in another proposal. Since I believe proper XML validation needs XML Schema, what if we use real XML Schema?

XSD example for a person

<xs:element name="person">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
      <xs:any minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element> 

Would it be feasible to represent the XSD in JSON Schema, for example like this?

{
  "title": "Person",
  "type": "string",
  "format": "xml",
  "format-constraints": "<xs:element name=\"person\"> ..."
}

The keys format and format-constraints are just there to highlight what I mean. We can use whatever terms we want. We should just make sure that JSON validators can handle it properly: in the worst case, a validator should accept any string (or, respectively, any XML). Other validators that support proper XML validation can use the information given in format-constraints to validate.

Moreover, this would allow supporting other formats in the future with the same principle.

What do you think?
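
To make the idea concrete, here is a rough Python sketch of how a validator could treat such a schema. Everything in it is an assumption for illustration: `format` and `format-constraints` are only the placeholder terms from above, `validate` and the `xsd_validator` hook are hypothetical names, and the fallback implements the "any XML" reading by checking well-formedness only.

```python
import xml.etree.ElementTree as ET

SCHEMA = {
    "title": "Person",
    "type": "string",
    "format": "xml",
    # Placeholder term from the proposal; the XSD text is elided as in the example.
    "format-constraints": '<xs:element name="person"> ...',
}


def validate(value, schema, xsd_validator=None):
    """Return True if `value` is acceptable under `schema`."""
    if schema.get("type") == "string" and not isinstance(value, str):
        return False
    if schema.get("format") != "xml":
        return True  # nothing XML-specific to check
    if xsd_validator is not None:
        # An XML-aware validator hands the XSD text to a real schema processor.
        return bool(xsd_validator(value, schema.get("format-constraints", "")))
    # Fallback: no XSD support, so accept any well-formed XML document.
    try:
        ET.fromstring(value)
        return True
    except ET.ParseError:
        return False


good = "<person><firstname>Ada</firstname><lastname>Lovelace</lastname></person>"
print(validate(good, SCHEMA))        # well-formed, accepted by the fallback
print(validate("<person>", SCHEMA))  # not well-formed, rejected
```

The hook keeps the JSON Schema side unaware of XSD: a validator without XML support still gives a usable (if loose) answer, and one with XSD support can plug in its own check.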

relu91 commented 2 years ago

I think it is an option. One downside that I see is that from the application point of view we are just talking about strings. It would be better to still maintain the ability to properly describe a formal data type. Basically, it works well for validating the payload but not to "transform"/"map"/"convert" to a consistent DataSchema value.

egekorkan commented 2 years ago

From the call of 15.12:

Decision: @danielpeintner will provide XML, XML schema examples where these corner cases exist. These edge cases should not require the expansion of the data schema. We will document how implementors should handle these cases but we need some examples and further iteration on this.

danielpeintner commented 2 years ago

Let me give you some examples where I think it is difficult to represent XML as JSON.

Note 1: I am not saying these are good examples. I just want to show that I think XML is more expressive.
Note 2: I tried to use https://www.freeformatter.com/xml-to-json-converter.html to experiment a bit, and some results are a bit surprising to me.

In the end, I don't know whether the examples I give below are good practice; in most cases, I don't think so.

Nillable and Type-Casts

XML allows type-casting a type to a subtype. For example, an xsd:decimal can be typed as xsd:integer or xsd:unsignedInt in the instance document. Type casts are also possible with complexTypes.

Moreover, any element can be marked as nil in an XML instance (xsi:nil="true") if the schema contains nillable="true" for this element.

Simple Content with attributes

XSD allows an element such as simpleValue to carry a simple value (here an integer) and still have an attribute. In JSON Schema, this would need to be mapped to an object.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="simpleValue">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:integer">
          <xs:attribute name="id" use="required" type="xs:string"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>

-->

<simpleValue id="dd">12</simpleValue>

The online tool converts it to

{
   "@id": "dd",
   "#text": "12"
}

Which is surprising to me.

Conflicting names (attributes vs element)

Here, value appears once as an attribute name and once as an element name, and the two can have different types.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="conflictingNames">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="value" type="xs:integer" />
      </xs:sequence>
      <xs:attribute name="value" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>

-->

<conflictingNames value="XX">
    <value>12</value>
</conflictingNames>
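
Both attribute cases can be sketched with the "@"/"#text" convention that the online tool's output suggests. This is only an illustration under that assumption: `element_to_json` is a hypothetical helper, it leaves all values as strings, and it does not handle repeated child names.

```python
import xml.etree.ElementTree as ET


def element_to_json(element):
    children = list(element)
    text = (element.text or "").strip()
    if not element.attrib and not children:
        return text  # plain leaf: just its character data
    # Attributes get an "@" prefix so they cannot collide with element names.
    result = {"@" + name: value for name, value in element.attrib.items()}
    for child in children:
        # Repeated child names would overwrite each other here; not handled.
        result[child.tag] = element_to_json(child)
    if text:
        result["#text"] = text
    return result


simple = element_to_json(ET.fromstring('<simpleValue id="dd">12</simpleValue>'))
conflict = element_to_json(ET.fromstring(
    '<conflictingNames value="XX"><value>12</value></conflictingNames>'
))
print(simple)    # attribute and character data end up side by side in one object
print(conflict)  # "@value" (attribute) and "value" (element) stay distinct
```

With a prefix convention like this, the attribute and the element named value map to different keys, so a data schema could describe them with different types.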

Sequence of elements

In complex types that use xsd:sequence, the order of elements matters by default. In JSON this is not the case. Anyhow, I don't think this is a big deal. What I am not sure about is the case where the same element name is declared twice. I am not sure whether this is actually an issue.

  <xs:element name="orderWithDifferentElements">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="value" type="xs:integer" />
        <xs:element name="value" type="xs:integer" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>

any and anyAttribute

XSD allows any element (any) or any attribute (anyAttribute) to appear. I think this is covered in JSON Schema by "additionalProperties": true.

What is not covered is the feature that any and anyAttribute can be limited to a given namespace.

egekorkan commented 2 years ago

I had to learn a bit of XML Schema to follow, so I am not sure if my answers are really correct.

My assessment of the situation: In X percent of the cases we can have a JSON Schema representation. To me, that X feels like 80%, but that is only my opinion, without much experience with XML-based APIs.

danielpeintner commented 2 years ago

Nillable: This is possible with JSON Schema if I say anyOf and give type:null for one of the options

In XSD you can use that for any type. With JSON Schema, I think you would need to wrap every type in anyOf, right?
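
A small Python sketch of that wrapping, assuming an integer element that may carry xsi:nil. The names `NILLABLE_INTEGER`, `read_nillable_int` and `matches` are hypothetical, and `matches` is a hand-written check for just these two types, not a JSON Schema validator.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xsi:nil under its expanded "{namespace}name" form.
XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"

# The wrapped schema: the original type plus "null" as a second option.
NILLABLE_INTEGER = {"anyOf": [{"type": "integer"}, {"type": "null"}]}


def read_nillable_int(xml_text):
    element = ET.fromstring(xml_text)
    if element.get(XSI_NIL) in ("true", "1"):
        return None  # nil element maps to JSON null
    return int(element.text)


def matches(value, schema):
    # Covers only the "integer" and "null" types used in this sketch.
    checks = {
        "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
        "null": lambda v: v is None,
    }
    return any(checks[option["type"]](value) for option in schema["anyOf"])


nil_value = read_nillable_int(
    '<value xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>'
)
int_value = read_nillable_int("<value>12</value>")
print(nil_value, int_value)
```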

Typecasting for simple types: First of all, there are fewer built-in types, but an integer validates against a number schema. If you mean that a number like 12.0 can be cast to 12 during validation, I am not sure whether that exists in the standard itself.

XSD has a type hierarchy (see https://www.w3.org/TR/xmlschema-2/#built-in-datatypes). I think JSON Schema may have one for integer and number, but not for the rest.

Typecasting for complex types: Not sure what the example can be here

An example can be any hierarchy again

One can declare that a Person is expected and then, at runtime, state that it is actually a Student.

Simple value with attribute: This was not part of some of the beginner tutorials but of course I see the uses. I am also surprised by the conversion but I think that this can have different results in JSON Schema since attribute is simply not supported in JSON. However, the same knowledge can be represented and we might have to prescribe how it should be done.

Yes, probably.

Conflicting names: Maybe depending on the result of solving the attribute description, this can be fine.

Mhh, since JSON has no attributes this is tricky ... maybe

Sequence of elements: I think this is a big deal since nothing in JSON Schema can constrain the order of object keys since there is no need to do so in JSON. A custom keyword with a JSON Schema vocabulary would be possible...

Agree

Sequence of elements, part 2: Well this is annoying since the key's value can have a different type based on its location. Even if we add a custom keyword to ensure order in JSON Schema objects, we cannot have two keys with the same name (or if we do, most JSON parsers take the second one)

I know.
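
For reference, this is how Python's standard json module (taken here as one example of a common parser) behaves with a repeated key, together with an array form that would keep both values and their order. The array representation is only one possible workaround, not something the current specification defines.

```python
import json

text = '{"value": 1, "value": 2}'

# The default parser keeps only the last occurrence of a repeated key.
last_wins = json.loads(text)

# object_pairs_hook exposes every key/value pair in document order instead.
pairs = json.loads(text, object_pairs_hook=list)

# An array of single-key objects preserves both order and duplicates.
as_array = [{key: value} for key, value in pairs]

print(last_wins)
print(pairs)
print(json.dumps(as_array))
```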

any and anyAttribute: I think I did not understand this, can you elaborate?

Essentially, in XSD one can say that some attribute (or element) is expected. As said, this is similar to "additionalProperties" in JSON Schema. The difference is that one can say the any MUST be of a certain namespace, e.g., the attribute must be in the context "https://w3id.org/saref#" only. In that case, other attributes are not allowed.

My assessment of the situation: In X percent of the cases we can have a JSON Schema representation. To me, that X feels like 80%, but that is only my opinion, without much experience with XML-based APIs.

I think the percentage is even higher, but I am not sure either.

egekorkan commented 1 year ago

Call from 02.11:

danielpeintner commented 1 year ago

Some initial findings / thoughts in a Gist

danielpeintner commented 7 months ago

Personally, I still see XML used in APIs nowadays (even though I don't have a good use case to share). Anyhow, I must admit that in most cases JSON is used.

Does this mean that describing XML payloads is no longer a valid use case? I don't think so...

FYI: RAML (the API modelling language) allows for both formats, JSON and XML.

egekorkan commented 7 months ago

Given that we have data mapping in our charter, I am adding the selected label.