Closed jamra closed 10 years ago
You can find all text nodes with "//text()"
.
For example, this:
var html = `<html><body><div>a</div><div>b</div></body></html>`
func run() error {
path, err := xmlpath.Compile("//text()")
if err != nil {
return err
}
root, err := xmlpath.ParseHTML(bytes.NewReader([]byte(html)))
if err != nil {
return err
}
iter := path.Iter(root)
for iter.Next() {
fmt.Println(iter.Node())
}
return nil
}
Outputs:
a
b
Thanks Gustavo.
Okay. I'm still getting some strange output like script nodes. I have no idea how to filter them out.
When I try to iterate through the root node obtained from ParseHTML, I cannot use any of the unexported properties such as the nodes array or the kind property. How can I find all of the text nodes?
I am trying to iterate through the hierarchy of nodes and filter out the text to be used later for searching.