antchfx / xmlquery

xmlquery is a Golang XPath package for XML queries.
https://github.com/antchfx/xpath
MIT License

Output XML does not preserve whitespace and this is in no way configurable #66

Closed max-carroll closed 2 years ago

max-carroll commented 3 years ago

The private function has a preserveSpaces option, but it is not exposed through the public method.

I feel this is a basic feature that would be universally required. Can we expose this parameter in the public function?

func outputXML(buf *bytes.Buffer, n *Node, preserveSpaces bool) {

Our application turns XML into HTML, so spaces need to be preserved. As a temporary workaround we've hardcoded this value to true, but a more sustainable approach would be to make it configurable somehow, perhaps as an extra parameter.

max-carroll commented 3 years ago

Another option would be outputXMLPreserveWhiteSpaces(buf ... Node ....), so that we don't have to change the signature of the other method. If anyone agrees with this, I can create a PR for it.

zhengchun commented 3 years ago

You can create a PR to fix this, but it should be compatible with previous versions and must not change the signature of any method that is already in use.
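
One way to meet that compatibility constraint is sketched below. It uses a simplified stand-in Node type, not the real xmlquery one, and the method name OutputXMLPreserveSpaces is hypothetical. The existing public method keeps its signature and behaviour, and a sibling method forwards preserveSpaces=true to the shared private helper.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"strings"
)

// Node is a simplified stand-in for xmlquery.Node (an assumption for this
// sketch): it models only elements and text, with no attributes.
type Node struct {
	Name     string // element name; empty for a text node
	Text     string // text content; used only by text nodes
	Children []*Node
}

// outputXML is the private helper. preserveSpaces decides whether text is
// written verbatim or with leading/trailing whitespace trimmed.
func outputXML(buf *bytes.Buffer, n *Node, preserveSpaces bool) {
	if n.Name == "" {
		text := n.Text
		if !preserveSpaces {
			text = strings.TrimSpace(text)
		}
		// Writes to a bytes.Buffer do not fail, so the error is ignored.
		_ = xml.EscapeText(buf, []byte(text))
		return
	}
	buf.WriteString("<" + n.Name + ">")
	for _, c := range n.Children {
		outputXML(buf, c, preserveSpaces)
	}
	buf.WriteString("</" + n.Name + ">")
}

// OutputXML keeps its existing signature and behaviour (whitespace trimmed),
// so current callers are unaffected.
func (n *Node) OutputXML() string {
	var buf bytes.Buffer
	outputXML(&buf, n, false)
	return buf.String()
}

// OutputXMLPreserveSpaces is the hypothetical new entry point that exposes
// the helper's preserveSpaces option.
func (n *Node) OutputXMLPreserveSpaces() string {
	var buf bytes.Buffer
	outputXML(&buf, n, true)
	return buf.String()
}

func main() {
	doc := &Node{Name: "p", Children: []*Node{{Text: "  hello  "}}}
	fmt.Println(doc.OutputXML())               // <p>hello</p>
	fmt.Println(doc.OutputXMLPreserveSpaces()) // <p>  hello  </p>
}
```

Adding a method rather than a parameter means no existing call site has to change, which is the constraint stated above.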

nbibler commented 3 years ago

I ran into this issue, as well.

I guess I'm ultimately curious about why this package would even prefer to manipulate the whitespace of nodes in the output XML at all. Mutating XML node values is generally frowned upon unless there's an XSD or XSLT provided that explicitly allows or instructs for it.

zhengchun commented 3 years ago

@nbibler, hi, could you provide an example? As I remember, this package does not process the whitespace of a node when outputting XML. See:

https://github.com/antchfx/xmlquery/blob/3abcaf968a97273621eca704d570cfe8d9f03d5c/node.go#L79

nbibler commented 3 years ago

@zhengchun: Maybe something like this?

func TestOutputXMLWithParentAndMixedContent(t *testing.T) {
    s := `<?xml version="1.0" encoding="utf-8"?>
    <class_list>
        <student xml:space="preserve">
            <name>Robert</name>
            A+
            B-
        </student>
    </class_list>`
    doc, _ := Parse(strings.NewReader(s))
    t.Log(doc.OutputXML(true))

    n := FindOne(doc, "/class_list/student")
    expected := "<name> Robert </name>\n\t\t\tA+\n\t\t\tB-\n\t\t"
    if g := doc.OutputXML(true); !strings.Contains(g, expected) {
        t.Errorf(`expected "%s", obtained "%s"`, expected, g)
    }
}

--- FAIL: TestOutputXMLWithParentAndMixedContent (0.00s)
    node_test.go:352: <><?xml version="1.0" encoding="utf-8"?><class_list><student xml:space="preserve"><name>Robert</name>A+&#xA;&#x9;&#x9;&#x9;B-</student></class_list></>
    node_test.go:357: expected "<name> Robert </name>
                                A+
                                B-
        ", obtained "<><?xml version="1.0" encoding="utf-8"?><class_list><student xml:space="preserve"><name>Robert</name>A+&#xA;&#x9;&#x9;&#x9;B-</student></class_list></>"
    node_test.go:362: the outputted xml contains newlines
    node_test.go:364: <name>Robert</name>A+&#xA;&#x9;&#x9;&#x9;B-
FAIL
FAIL    github.com/antchfx/xmlquery     0.164s
FAIL

zhengchun commented 2 years ago

Please use the xml:space="preserve" attribute in your XML file; it keeps whitespace when you call the OutputXML() method to print the XML. https://github.com/antchfx/xmlquery/pull/12
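
The general idea behind xml:space="preserve" can be sketched with only the standard library's encoding/xml. This is an illustration, not xmlquery's implementation: the function name reserialize is hypothetical, and the sketch drops attributes and namespace prefixes on output for brevity. Text is trimmed unless the nearest enclosing xml:space declaration says "preserve".

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// xmlNS is the namespace URI that the reserved "xml" prefix is bound to.
const xmlNS = "http://www.w3.org/XML/1998/namespace"

// reserialize (hypothetical helper) re-emits src, trimming each text node
// unless an enclosing element set xml:space="preserve". Attributes are not
// written back, to keep the sketch short.
func reserialize(src string) (string, error) {
	dec := xml.NewDecoder(strings.NewReader(src))
	var buf bytes.Buffer
	preserve := []bool{false} // stack; the top entry applies to current text
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return buf.String(), nil
		}
		if err != nil {
			return "", err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			keep := preserve[len(preserve)-1] // inherit from the parent
			for _, a := range t.Attr {
				isXMLSpace := a.Name.Local == "space" &&
					(a.Name.Space == xmlNS || a.Name.Space == "xml")
				if isXMLSpace {
					keep = a.Value == "preserve" // "default" switches it off
				}
			}
			preserve = append(preserve, keep)
			buf.WriteString("<" + t.Name.Local + ">")
		case xml.EndElement:
			preserve = preserve[:len(preserve)-1]
			buf.WriteString("</" + t.Name.Local + ">")
		case xml.CharData:
			text := string(t)
			if !preserve[len(preserve)-1] {
				text = strings.TrimSpace(text)
			}
			// Writes to a bytes.Buffer do not fail, so the error is ignored.
			_ = xml.EscapeText(&buf, []byte(text))
		}
	}
}

func main() {
	out, err := reserialize(`<a><b xml:space="preserve"> x </b><c> y </c></a>`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // <a><b> x </b><c>y</c></a>
}
```

Because xml.EscapeText writes newlines and tabs as character references, preserved whitespace in mixed content shows up as &#xA; and &#x9; in the output, much like the test log earlier in this thread.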