ballerina-platform / ballerina-library

The Ballerina Library
https://ballerina.io/learn/api-docs/ballerina/
Apache License 2.0

`data.xmldata` Module Fails to Deserialize Semantically Equivalent XMLs to Records #6725

Closed AzeemMuzammil closed 3 months ago

AzeemMuzammil commented 4 months ago

Description:

The `data.xmldata` module fails to deserialize XML into records when only the namespace prefixes differ, even though the documents are semantically equivalent. The failure occurs when the XML being deserialized uses a different namespace prefix from the one used during serialization, while the namespace URI stays the same.

Consider the following XML and record:

Initial XML

<root xmlns:ex="http://www.example.com/schema">
    <ex:item>Example Item</ex:item>
</root>

Corresponding Record:

import ballerina/data.xmldata;

@xmldata:Name {value: "root"}
type Root record {
    @xmldata:Namespace {prefix: "ex", uri: "http://www.example.com/schema"}
    string item;
};

Serialization:

Root root = {
    item: "Example Item"
};
// Serialization works as expected; `check` unwraps the xml value so it can be passed on.
xml xmlData = check xmldata:toXml(root);

If I change the resulting XML to:

<root xmlns:abc="http://www.example.com/schema">
    <abc:item>Example Item</abc:item>
</root>

Deserialization:

// xmlData now refers to the modified XML above (prefix abc instead of ex).
Root newRoot = check xmldata:parseAsType(xmlData);

Actual Result: Deserialization fails with an error.

Expected Behavior: When the XML documents are semantically equivalent (identical URIs, differing only in prefix), the xmldata module should successfully serialize and deserialize the XML without errors.

Steps to reproduce:

  1. Define a Ballerina record with namespace information.
  2. Serialize the record to XML using xmldata:toXml().
  3. Modify the resulting XML by changing the namespace prefix (but not the URI).
  4. Attempt to deserialize the modified XML back to the record using xmldata:parseAsType().
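
The steps above can be put together as one small program. This is a minimal sketch that reuses the `Root` record and the two XML documents from this issue; the variable names `serialized`, `modified`, and `result` are only illustrative, and the modified XML is written as an XML literal instead of being edited by hand. Whether the final call fails or succeeds depends on the `data.xmldata` version in use, so the sketch prints the outcome rather than assuming it.

```ballerina
import ballerina/data.xmldata;
import ballerina/io;

@xmldata:Name {value: "root"}
type Root record {
    @xmldata:Namespace {prefix: "ex", uri: "http://www.example.com/schema"}
    string item;
};

public function main() returns error? {
    // Steps 1 and 2: serialize a record that carries namespace information.
    Root root = {item: "Example Item"};
    xml serialized = check xmldata:toXml(root);
    io:println("serialized: ", serialized);

    // Step 3: the same document with the prefix changed from `ex` to `abc`.
    // The namespace URI is unchanged.
    xml modified = xml `<root xmlns:abc="http://www.example.com/schema"><abc:item>Example Item</abc:item></root>`;

    // Step 4: try to deserialize the modified XML back into the record.
    Root|xmldata:Error result = xmldata:parseAsType(modified);
    if result is xmldata:Error {
        io:println("parseAsType failed: ", result.message());
    } else {
        io:println("parseAsType succeeded: ", result);
    }
}
```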

hasithaa commented 4 months ago

I think the deserialization behavior is correct, as per the discussion in https://github.com/ballerina-platform/ballerina-spec/issues/1268. The two XML values are not `==`, and we designed the data.xmldata annotations based on this. Maybe we can have a less restrictive version that can be enabled via an option.

hasithaa commented 4 months ago

Since this is one of the common use cases, we can relax strict equality, and make the semantic equality work for the common case by default. To enable strict equality we can provide an option. @prakanth97 WDYT?

prakanth97 commented 4 months ago

Agreed. That seems like the better option.