antchfx / htmlquery

htmlquery is a Go XPath package for querying HTML.
https://github.com/antchfx/xpath
MIT License

[bug] htmlquery.QueryAll not returning all node attribute results #49

Closed bnkai closed 2 years ago

bnkai commented 2 years ago

It seems that the nodup fix from #6 causes attributes that have the same value as the first result to be discarded. Example (only used for testing), taken from #20 and adjusted:

package main

import (
        "fmt"
        "strings"

        "github.com/antchfx/htmlquery"
)

func main() {
        t := []string{
                "<span class='output' sup='24'></span><span class='output' sup='24'></span><span class='output' sup='45'></span>",
                "<span class='output' sup='24'></span><span class='output' sup='45'></span><span class='output' sup='45'></span>",
                "<span class='output' sup='45'></span><span class='output' sup='24'></span><span class='output' sup='45'></span>",
        }

        for _, s := range t {
                doc, _ := htmlquery.Parse(strings.NewReader(s))
                nx, _ := htmlquery.QueryAll(doc, "//span[@class='output']/@sup")
                for _, n := range nx {
                        fmt.Println(htmlquery.InnerText(n))
                }
                fmt.Println("--")
        }
}

Output

24
45
--
24
45
45
--
45
24
--

Expected Output

24
24
45
--
24
45
45
--
45
24
45
--

zhengchun commented 2 years ago

Thanks for the report. I removed the duplicate node checking, so it works now.
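
The behavior reported above is consistent with a duplicate check that compares each candidate against the first result by name and value rather than by node identity. The sketch below is only an illustration of that idea under this assumption; it is not the library's code, it does not use htmlquery at all, and the names `attr`, `collectWithCheck`, and `collectAll` are hypothetical.

```go
package main

import "fmt"

// attr is a hypothetical stand-in for an attribute node matched by a query.
type attr struct {
	name, value string
}

// collectWithCheck mimics an assumed duplicate check: any later match whose
// name and value equal those of the first result is skipped.
func collectWithCheck(in []attr) []string {
	var out []string
	for i, a := range in {
		if i > 0 && a == in[0] {
			continue // same name and value as the first result: dropped
		}
		out = append(out, a.value)
	}
	return out
}

// collectAll keeps every match, which is what removing the check amounts to.
func collectAll(in []attr) []string {
	out := make([]string, 0, len(in))
	for _, a := range in {
		out = append(out, a.value)
	}
	return out
}

func main() {
	// The three "sup" sequences from the report.
	cases := [][]attr{
		{{"sup", "24"}, {"sup", "24"}, {"sup", "45"}},
		{{"sup", "24"}, {"sup", "45"}, {"sup", "45"}},
		{{"sup", "45"}, {"sup", "24"}, {"sup", "45"}},
	}
	for _, c := range cases {
		fmt.Println("with check:", collectWithCheck(c), "without:", collectAll(c))
	}
}
```

In this sketch, only values equal to the first result are lost, which is why the second input in the report came back complete while the first and third each lost one entry.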

bnkai commented 2 years ago

That was fast! Thanks for the fix.