NaturalIntelligence / fast-xml-parser

Validate XML, Parse XML and Build XML rapidly without C/C++ based libraries and no callback.
https://naturalintelligence.github.io/fast-xml-parser/
MIT License

Hex value in CDATA being parsed #535

Closed estanglerbm closed 1 year ago

estanglerbm commented 1 year ago

Description

A hex value inside CDATA (e.g. CDATA[0x1]) is parsed as a number and output in decimal form instead of being kept as the original text.

Input

<A><![CDATA[0x1]]></A>

Code

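    import { XMLParser } from "fast-xml-parser";

    // Default options: no numberParseOptions or cdataPropName set.
    const options = {};
    const test_xml: string = "<A><![CDATA[0x1]]></A>";
    const parser = new XMLParser(options);
    const result = parser.parse(test_xml);
    console.log(`showBug: ${JSON.stringify(result)}`);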

Output

showBug: {"A":1}

Expected data

showBug: {"A":0x1} or showBug: {"A":"0x1"}

estanglerbm commented 1 year ago

I should say that this worked in v3. It also works in the online tool, for some reason. The numberParseOptions option mentioned in #513 is a workaround, but I don't know how robust it is. Since this is CDATA, it shouldn't need these options at all.
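For reference, a minimal sketch of that workaround. It assumes the hex flag of numberParseOptions as described in the v4 documentation. Only the options object is shown, and it would be passed to new XMLParser(options) as in the report above.

```typescript
// Options fragment only; pass it to `new XMLParser(options)`.
// Assumption: `hex: false` tells the value parser to leave 0x-prefixed
// text alone, so "0x1" is not converted to the number 1.
const options = {
  numberParseOptions: {
    hex: false,
  },
};

console.log(JSON.stringify(options));
```

With this set, the intent is that the test case yields the string "0x1" for A. It applies to every tag value, not just CDATA, which is why it is only a workaround.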

amitguptagwl commented 1 year ago

In case of CDATA, it should be ignored. Will have to check why it is being rendered.

amitguptagwl commented 1 year ago

Please read the documentation:

    const options = {
        cdataPropName: "cdata"
    };

estanglerbm commented 1 year ago

I understand that cdataPropName would be a workaround (not a solution) for this test case. But the test case is a simplified version of code that worked in v3, where CDATA is mixed with non-CDATA text. I can't use cdataPropName there, because once the CDATA and non-CDATA parts are separated, there's no way to reconstruct how they were interleaved. Even if the parser merges them, CDATA shouldn't be interpreted like this.

estanglerbm commented 1 year ago

Well, actually, if the CDATA text is merged, then it would be interpreted the same way non-CDATA text is. I guess my real issue is that hex number parsing happens at all by default (so that numberParseOptions is needed to disable it). I don't remember that happening in v3 by default.

amitguptagwl commented 1 year ago

Initially, when it was designed, we added the cdata property to be set when CDATA doesn't have to be ignored. So if it is set, then its value must not be parsed. Otherwise, it should be parsed before being merged into the parent tag value.
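
The rule described above can be sketched roughly as follows. This is a self-contained illustration, not fast-xml-parser's actual code: parseScalar and resolveCdata are hypothetical names, and parseScalar is only a simplified stand-in for the library's number parsing.

```typescript
// Hypothetical stand-in for the library's value parsing: converts plain
// decimal and 0x-prefixed hex strings to numbers, leaves other text as is.
function parseScalar(text: string): string | number {
  const trimmed = text.trim();
  if (/^0x[0-9a-f]+$/i.test(trimmed)) return Number.parseInt(trimmed, 16);
  if (/^-?\d+(\.\d+)?$/.test(trimmed)) return Number(trimmed);
  return text;
}

type TagValue = string | number | Record<string, string>;

// Hypothetical helper modelling the rule: with a CDATA property name
// configured, the text is kept verbatim; without one, it is parsed like
// ordinary tag text before being merged into the parent value.
function resolveCdata(cdataText: string, cdataPropName?: string): TagValue {
  if (cdataPropName) {
    return { [cdataPropName]: cdataText };
  }
  return parseScalar(cdataText);
}

console.log(JSON.stringify(resolveCdata("0x1")));          // 1
console.log(JSON.stringify(resolveCdata("0x1", "cdata"))); // {"cdata":"0x1"}
```

Under this model, the {"A":1} result from the report is the "otherwise" branch: no cdataPropName is set, so the CDATA text goes through the same hex parsing as any other tag value.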