antchfx / htmlquery

htmlquery is a Go XPath package for querying HTML.
https://github.com/antchfx/xpath
MIT License

Question: How can I remove a specific child from its parent so that InnerText does not return the child's text? #73

Closed: greenoctopus20 closed this issue 5 months ago

greenoctopus20 commented 5 months ago

I have a list of nodes to include (parents) and a list of nodes to exclude (children). I want to get the text of the parents without the text of any excluded node that is a child of one of the included nodes.

At the moment I'm relying on replacing the child's inner text with "" in the parent's text, but that's not really what I'm looking for. Is there a way to set the child node to nil, or delete it from the tree, without affecting the other child nodes?

My approach at the moment:

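import (
    "strings"

    "github.com/antchfx/htmlquery"
    "golang.org/x/net/html"
)

// CombineInclExclToTxt concatenates the inner text of the include nodes, then
// removes every occurrence of each excluded node's text by string replacement.
func CombineInclExclToTxt(includes, excludes []*html.Node) string {
    includeText := ""
    for _, node := range includes {
        includeText += htmlquery.InnerText(node)
    }

    for _, excl := range excludes {
        excludeString := htmlquery.InnerText(excl)
        includeText = strings.ReplaceAll(includeText, excludeString, "")
    }
    return includeText
}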

Thank you

greenoctopus20 commented 5 months ago

Never mind, I found a solution in the net/html package.