xuri / xgen

XSD (XML Schema Definition) parser and Go/C/Java/Rust/TypeScript code generator
BSD 3-Clause "New" or "Revised" License

xs:union is interpreted wrongly #64

Open mpkondrashin opened 1 year ago

mpkondrashin commented 1 year ago

Description

If the XSD contains xs:union in a type definition, xgen generates a struct that contains both member types as fields. As a result, the XML fails to parse.

Steps to reproduce the issue:

  1. Example XSD:
    <?xml version="1.0" encoding="UTF-8"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:simpleType name="integer-or-empty">
        <xs:union memberTypes="xs:integer empty-string" />
      </xs:simpleType>
      <xs:simpleType name="empty-string">
        <xs:restriction base="xs:string">
          <xs:enumeration value="" />
        </xs:restriction>
      </xs:simpleType>
      <xs:element name="NETWORK_ACCESS_URL">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="DomainCensus" maxOccurs="1" minOccurs="0">
              <xs:complexType>
                <xs:simpleContent>
                  <xs:extension base="xs:string">
                    <xs:attribute type="integer-or-empty" name="AccessCount"/>
                  </xs:extension>
                </xs:simpleContent>
              </xs:complexType>
            </xs:element>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>
  2. Run the following command:
    xgen -i filename.xsd -o report.go -l Go -p report
  3. The following Go code is generated:
    
    // Code generated by xgen. DO NOT EDIT.

    package report

    import (
    	"encoding/xml"
    )

    // Integerorempty ...
    type Integerorempty struct {
    	XMLName     xml.Name `xml:"integer-or-empty"`
    	Emptystring *Emptystring
    	Integer     int
    }

    // Emptystring ...
    type Emptystring string

    // DomainCensus ...
    type DomainCensus struct {
    	AccessCountAttr *Integerorempty `xml:"AccessCount,attr,omitempty"`
    	Value           string          `xml:",chardata"`
    }

    // NETWORKACCESSURL ...
    type NETWORKACCESSURL struct {
    	XMLName      xml.Name      `xml:"NETWORK_ACCESS_URL"`
    	DomainCensus *DomainCensus `xml:"DomainCensus"`
    }

  4. Here is a sample XML file to parse (report.xml):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<NETWORK_ACCESS_URL>
  <DomainCensus AccessCount="149511825">84786</DomainCensus>
</NETWORK_ACCESS_URL>
```

  5. Test code (file report_test.go):

    package report

    import (
    	"encoding/xml"
    	"os"
    	"testing"
    )

    func TestReport(t *testing.T) {
    	data, err := os.ReadFile("report.xml")
    	if err != nil {
    		t.Fatal(err)
    	}
    	var report NETWORKACCESSURL
    	if err := xml.Unmarshal(data, &report); err != nil {
    		t.Fatal(err)
    	}
    }

**Describe the results you received:**
go test produces the following:

    --- FAIL: TestReport (0.00s)
        report_test.go:16: cannot unmarshal into report.Integerorempty
    FAIL
    exit status 1
    FAIL	github.com/mpkondrashin/ddan/report/t	0.251s

**Describe the results you expected:**
I expected the file to be parsed successfully.

**Output of `go version`:**
```text
go version go1.18.3 darwin/amd64
```

xgen version or commit ID:

xgen version: 0.1.0

Environment details (OS, physical, etc.): macOS Monterey

Possible solution

I have no clue how to solve this, as union types are not supported by Go (not sure about other languages).

What worked for me: changing Integerorempty to int throughout the whole report.go file (using Search & Replace) fixes this issue.

I assume an option to force certain types to be substituted with other types would be the solution. See issue https://github.com/xuri/xgen/issues/10. Such an option would solve that issue too (interpret XSD "integer" as int, int64, or big.Int, as the user wants).
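
For illustration, here is a minimal sketch of how the union could be modelled by hand in Go using the standard xml.UnmarshalerAttr interface. This is not something xgen generates, and it is not a proposed patch. The type name IntegerOrEmpty, its Value and Valid fields, and the parse helper are hypothetical names made up for this example. It also assumes that int64 is wide enough for the data, although xs:integer is unbounded.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strconv"
)

// IntegerOrEmpty is a hypothetical hand-written stand-in for the generated
// Integerorempty struct. It models the union of xs:integer and empty-string.
type IntegerOrEmpty struct {
	Value int64 // parsed integer; meaningful only when Valid is true
	Valid bool  // false when the attribute value was the empty string
}

// UnmarshalXMLAttr implements xml.UnmarshalerAttr, so encoding/xml calls it
// instead of trying (and failing) to decode the attribute into a plain struct.
func (v *IntegerOrEmpty) UnmarshalXMLAttr(attr xml.Attr) error {
	if attr.Value == "" {
		*v = IntegerOrEmpty{}
		return nil
	}
	n, err := strconv.ParseInt(attr.Value, 10, 64)
	if err != nil {
		return fmt.Errorf("attribute %s: %w", attr.Name.Local, err)
	}
	*v = IntegerOrEmpty{Value: n, Valid: true}
	return nil
}

// DomainCensus mirrors the generated struct, with the attribute type swapped.
type DomainCensus struct {
	AccessCountAttr *IntegerOrEmpty `xml:"AccessCount,attr,omitempty"`
	Value           string          `xml:",chardata"`
}

// NETWORKACCESSURL mirrors the generated root element struct.
type NETWORKACCESSURL struct {
	XMLName      xml.Name      `xml:"NETWORK_ACCESS_URL"`
	DomainCensus *DomainCensus `xml:"DomainCensus"`
}

// parse is a small helper for this example only.
func parse(doc string) (*NETWORKACCESSURL, error) {
	var r NETWORKACCESSURL
	if err := xml.Unmarshal([]byte(doc), &r); err != nil {
		return nil, err
	}
	return &r, nil
}

func main() {
	docs := []string{
		`<NETWORK_ACCESS_URL><DomainCensus AccessCount="149511825">84786</DomainCensus></NETWORK_ACCESS_URL>`,
		`<NETWORK_ACCESS_URL><DomainCensus AccessCount="">84786</DomainCensus></NETWORK_ACCESS_URL>`,
	}
	for _, doc := range docs {
		r, err := parse(doc)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		a := r.DomainCensus.AccessCountAttr
		fmt.Println(a.Value, a.Valid)
	}
}
```

Unlike the Search & Replace workaround with plain int, a type like this would also accept AccessCount="", which the empty-string member of the union allows.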