NaturalIntelligence / fast-xml-parser

Validate XML, Parse XML and Build XML rapidly without C/C++ based libraries and no callback.
https://naturalintelligence.github.io/fast-xml-parser/
MIT License
2.45k stars 296 forks

Parser must skip unopened closing tags #577

Open varad11 opened 1 year ago

varad11 commented 1 year ago

Purpose / Goal

When an XML/HTML document has closing tags without matching opening tags, the parser should skip them and continue parsing rather than blocking execution by throwing an exception.

Code

const { XMLParser } = require("fast-xml-parser");

// A template literal is used so the XML can span lines and keep its single-quoted attributes.
new XMLParser({ ignoreAttributes: false })
.parse(`<rootNode>
<parentTag attr='my attr 1'>
    <childTag>Hello</childTag>
</parentTag>
<parentTag attr='my attr 2'>
    </childTag> <!--Unopened Closing Tag-->
    </childTag> <!--Unopened Closing Tag-->
</parentTag>
</parentTag> <!--Unopened Closing Tag-->
<parentTag attr='my attr 3'>
    <childTag>World</childTag>
</parentTag>
</rootNode>`)

Output

Uncaught TypeError TypeError: Cannot read properties of undefined (reading 'addChild')

Expected Output

{
    "rootNode": {
        "parentTag": [
            {
                "childTag": "Hello",
                "@_attr": "my attr 1"
            },
            {
                "@_attr": "my attr 2"
            },
            {
                "childTag": "World",
                "@_attr": "my attr 3"
            }
        ]
    }
}
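
For illustration only, the skip-and-continue behaviour can be sketched as a standalone pre-processing step. `stripUnopenedClosingTags` is a hypothetical helper, not part of fast-xml-parser and not the code in this PR. It assumes a closing tag is "unopened" whenever it does not match the most recently opened tag, and it does not handle CDATA sections or DOCTYPE declarations.

```javascript
// Hypothetical helper (not a fast-xml-parser API): removes closing tags that
// do not match the most recently opened tag and leaves everything else as-is.
function stripUnopenedClosingTags(xml) {
  // Alternatives: a comment, a closing tag, or an opening/self-closing tag.
  const token =
    /<!--[\s\S]*?-->|<\/([^\s>]+)\s*>|<([^\s>\/!?]+)(?:"[^"]*"|'[^']*'|[^>"'])*>/g;
  const open = []; // stack of currently open tag names
  let out = "";
  let last = 0;
  let m;
  while ((m = token.exec(xml)) !== null) {
    out += xml.slice(last, m.index); // copy text between tokens
    last = token.lastIndex;
    const [text, closeName, openName] = m;
    if (closeName !== undefined) {
      if (open[open.length - 1] === closeName) {
        open.pop();
        out += text; // matched closing tag: keep it
      }
      // otherwise the closing tag is unopened: skip it
    } else {
      // Track real opening tags; comments and self-closing tags are only copied.
      if (openName !== undefined && !text.endsWith("/>")) open.push(openName);
      out += text;
    }
  }
  return out + xml.slice(last);
}

const dirty = "<r><p a='1'></c></p></p></r>";
console.log(stripUnopenedClosingTags(dirty)); // <r><p a='1'></p></r>
```

The cleaned string could then be passed to `XMLParser.parse`. Doing the skip inside the parser, as this PR proposes, avoids the extra pass over the input.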

Benchmark

Before Changes

Running Suite: XML Parser benchmark
fxp v3 : 84779.33893486593 requests/second
fxp : 51112.199678104116 requests/second
fxp - preserve order : 50404.274179940876 requests/second       
xmlbuilder2 : 19283.905598973546 requests/second
xml2js  : 13955.238383909073 requests/second

After Changes

Running Suite: XML Parser benchmark
fxp v3 : 78604.65479491939 requests/second
fxp : 47949.98355614185 requests/second
fxp - preserve order : 51482.93159290693 requests/second
xmlbuilder2 : 20161.459509073473 requests/second
xml2js  : 14325.551131080862 requests/second
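
To read the two runs side by side, the relative change per entry can be computed as below. This is a small sketch using only the figures above; `percentChange` is an illustrative helper name. The comparison libraries, which this change does not touch, also differ between the two runs, so some of the difference may be run-to-run variance.

```javascript
// Throughput in requests/second, copied from the two benchmark runs above.
const before = {
  "fxp v3": 84779.33893486593,
  "fxp": 51112.199678104116,
  "fxp - preserve order": 50404.274179940876,
  "xmlbuilder2": 19283.905598973546,
  "xml2js": 13955.238383909073,
};
const after = {
  "fxp v3": 78604.65479491939,
  "fxp": 47949.98355614185,
  "fxp - preserve order": 51482.93159290693,
  "xmlbuilder2": 20161.459509073473,
  "xml2js": 14325.551131080862,
};

// Relative change from oldValue to newValue, in percent.
function percentChange(oldValue, newValue) {
  return ((newValue - oldValue) / oldValue) * 100;
}

for (const name of Object.keys(before)) {
  const delta = percentChange(before[name], after[name]);
  console.log(`${name}: ${delta.toFixed(2)}%`);
}
```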

Type

Please mention the type of PR

amitguptagwl commented 1 year ago

Thanks for the PR. However, the merge may be delayed because major changes to the library are in progress. As part of the new development, there is a plan to introduce a strict property. If it is false, a warning will be issued but parsing will continue.

And this is not a bug fix but a feature :)