antchfx / xmlquery

xmlquery is a Go XPath package for querying XML.
https://github.com/antchfx/xpath
MIT License

No error or panic when parsing non-XML #86

Open tuan-nxcr opened 2 years ago

tuan-nxcr commented 2 years ago

I was expecting Parse or Query to error out if I were to pass in something completely invalid, but it falls all the way through instead:

func TestXmlParse_1(t *testing.T) {
    //s := `<note>
    //<to>John</to>
    //<from>Smith</from>
    //<heading>Reminder</heading>
    //<body>Don't forget me this weekend!</body>
    //</note>`

    s := `{"NotXml":"ActuallyJson"}`

    parse, err := xmlquery.Parse(strings.NewReader(s))
    if err != nil {
        println(err.Error())
        t.Fail()
    }
    query, err := xmlquery.Query(parse, "//body/text()")
    if err != nil {
        println(err.Error())
        t.Fail()
    }
    println(query.Data)

}

using:

zhengchun commented 2 years ago

Add an if query != nil {} check before println(query.Data) to avoid a nil pointer panic.

tuan-nxcr commented 2 years ago

@zhengchun this is more a question of why Parse proceeds without any error when I give it invalid XML, not of how to handle a nil pointer exception.

zhengchun commented 2 years ago

Sorry. Parse uses https://pkg.go.dev/encoding/xml#NewDecoder for parsing, and there is no additional step that checks whether the input document is XML or JSON.

tuan-nxcr commented 2 years ago

@zhengchun

Thanks for reopening! Yes, I was just looking for some way to validate that we have at least well-formed XML before proceeding to parsing (similar to what is recommended by the authors of the gjson package https://github.com/tidwall/gjson#validate-json).

My workaround for this right now is to precede the xmlquery.Parse step with:

func isXml(someString string) bool {
    return xml.Unmarshal([]byte(someString), new(interface{})) == nil
}

(the above was inspired by this Stack Overflow answer, which I will say is quite unintuitive)