lestrrat-go / libxml2

Interface to libxml2, with DOM interface
MIT License
230 stars 55 forks source link

dom.SetNamespace creates `xmlns:` in child elements #102

Closed guitarpawat closed 3 months ago

guitarpawat commented 4 months ago

Describe the bug

In godoc of function dom.SetNamespace says that the XML namespace declaration will not be created if the namespace is already declared in a previous tree hierarchy. But currently, this function will create duplicate xmlns: in child elements

Please attach the output of go version

go version go1.22.4 linux/amd64

To Reproduce / Expected behavior Please attach a standalone Go test code that shows the problem, and what you expected to happen. The code:

package main

import (
    "fmt"
    "github.com/lestrrat-go/libxml2/dom"
)

func main() {
    root := dom.CreateDocument()
    doc, _ := root.CreateElement("Document")
    _ = doc.SetNamespace("https://example.com", "ex")
    _ = root.SetDocumentElement(doc)
    test, _ := root.CreateElement("Test")
    _ = test.SetNamespace("https://example.com", "ex")
    _ = doc.AddChild(test)
    test.SetNodeValue("test")
    fmt.Println(root.Dump(true))
}

Expected result:

<?xml version="1.0" encoding="utf-8"?>
<ex:Document xmlns:ex="https://example.com">
  <ex:Test>test</ex:Test>
</ex:Document>

Actual result:

<?xml version="1.0" encoding="utf-8"?>
<ex:Document xmlns:ex="https://example.com">
  <ex:Test xmlns:ex="https://example.com">test</ex:Test>
</ex:Document>
lestrrat commented 4 months ago

@guitarpawat Can you please check #103?

guitarpawat commented 4 months ago

@lestrrat Thanks for a quick response, it seems like your commit doesn't set parent element to the namespace

<?xml version="1.0" encoding="utf-8"?>
<Document>
  <ex:Test xmlns:ex="https://example.com">test</ex:Test>
</Document>

And it still have duplicate xmlns if we set the namespace for parent after child

func main() {
    root := dom.CreateDocument()
    doc, _ := root.CreateElement("Document")
    //_ = doc.SetNamespace("https://example.com", "ex") // delete
    _ = root.SetDocumentElement(doc)
    test, _ := root.CreateElement("Test")
    _ = test.SetNamespace("https://example.com", "ex")
    _ = doc.SetNamespace("https://example.com", "ex") // move to here
    _ = doc.AddChild(test)
    test.SetNodeValue("test")
    fmt.Println(root.Dump(true))
}
<ex:Document xmlns:ex="https://example.com">
  <ex:Test xmlns:ex="https://example.com">test</ex:Test>
</ex:Document>
lestrrat commented 4 months ago

@guitarpawat Oh, I found your problem. xmlSearchNs can only find duplicate namespaces if the node in question is already part of a document. You are setting the namespace and then adding the node, that's why it failed to find your namespace.

    t.Run("child ns after parent ns", func(t *testing.T) {
        root := CreateDocument()
        doc, _ := root.CreateElement("Document")
        _ = doc.SetNamespace("https://example.com", "ex")
        test, _ := root.CreateElement("Test")
        _ = doc.AddChild(test) // <---- THIS needs to go BEFORE SetNamespace
        _ = root.SetDocumentElement(doc)
        _ = test.SetNamespace("https://example.com", "ex")
        test.SetNodeValue("test")

        // root.Dump(true) should only contain one namespace declaration
        // for the 'ex' prefix
        dump := root.Dump(true)
        count := strings.Count(dump, `xmlns:ex="https://example.com"`)
        t.Logf("%s", dump)
        require.Equal(t, 1, count, "expected only one namespace declaration for 'ex' prefix")
    })

Also, it is NOT possible to find duplicates on children. So if you add a namespace on your child node and then add to a parent, that's not going to be fixed.

I added a minor fix to #103 including fixes to the test, and I believe it's working as intended now.

guitarpawat commented 4 months ago

Hi @lestrrat, I pulled your latest version from #103 (commit b1f63e9) and it is still not working for me

The xmlns is gone in child nodes but it also doesn't have a namespace prefix

package main

import (
    "fmt"
    "github.com/lestrrat-go/libxml2/dom"
)

func main() {
    root := dom.CreateDocument()
    doc, _ := root.CreateElement("Document")
    _ = root.SetDocumentElement(doc)
    _ = doc.SetNamespace("https://example.com", "ex")
    test, _ := root.CreateElement("Test")
    _ = doc.AddChild(test)
    _ = test.SetNamespace("https://example.com", "ex")
    test2, _ := root.CreateElement("Test2")
    _ = test.AddChild(test2)
    _ = test2.SetNamespace("https://example.com", "ex")

    fmt.Println(root.Dump(true))
}
<?xml version="1.0" encoding="utf-8"?>
<ex:Document xmlns:ex="https://example.com">
  <Test>
    <Test2/>
  </Test>
</ex: Document>

For clarification, I expect my message to look like this spec.

<ex:Document xmlns:ex="https://example.com">
  <ex:Test>
    <ex:Test2/>
  </ex:Test>
</ex:Document>

I managed to find a workaround by using CreateElementNS instead but it still has another issue with default namespace (will open another issue):

    root := dom.CreateDocument()
    doc, _ := root.CreateElementNS("https://example.com", "ex:Document")
    _ = root.SetDocumentElement(doc)
    test, _ := root.CreateElementNS("https://example.com", "ex:Test")
    _ = doc.AddChild(test)
    test2, _ := root.CreateElementNS("https://example.com", "ex:Test2")
    _ = test.AddChild(test2)

    fmt.Println(root.Dump(true))
lestrrat commented 4 months ago

Oh, oops?

~/dev/src/github.com/lestrrat-go/libxml2$ git push origin gh-102
To github.com:lestrrat-go/libxml2.git
 ! [rejected]        gh-102 -> gh-102 (fetch first)
error: failed to push some refs to 'github.com:lestrrat-go/libxml2.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

How about now?

guitarpawat commented 4 months ago

@lestrrat I pulled commit 38fc65c32878 Child nodes still don't have prefix.

package main

import (
    "fmt"
    "github.com/lestrrat-go/libxml2/dom"
)

func main() {
    root := dom.CreateDocument()
    doc, _ := root.CreateElement("Document")
    _ = doc.SetNamespace("https://example.com", "ex")
    test, _ := root.CreateElement("Test")
    _ = doc.AddChild(test) // <---- THIS needs to go BEFORE SetNamespace
    _ = root.SetDocumentElement(doc)
    _ = test.SetNamespace("https://example.com", "ex")
    test.SetNodeValue("test")
    fmt.Println(root.Dump(true))
}
<?xml version="1.0" encoding="utf-8"?>
<ex:Document xmlns:ex="https://example.com">
  <Test>test</Test>                            <- this should be ex:Test
</ex:Document>
lestrrat commented 4 months ago

meh. Okay, I will be offline for a few days, so I'll work on it when I come back.

guitarpawat commented 4 months ago

Got it, thank you!

github-actions[bot] commented 3 months ago

This issue is stale because it has been open 14 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] commented 3 months ago

This issue was closed because it has been stalled for 7 days with no activity. This does not mean your issue is rejected, but rather it is done to hide it from the view of the maintains for the time being. Feel free to reopen if you have new comments