antchfx / htmlquery

htmlquery is a Go XPath package for querying HTML.
https://github.com/antchfx/xpath
MIT License

FindOne with expression containing groups only works the first time #56

Closed JWAlberty closed 1 year ago

JWAlberty commented 1 year ago

If I use an expression like (//*/div), it works the first time, but subsequent calls with the same expression return nil. If I simply drop the parentheses, every call returns the same result. This is problematic because I need the parentheses to pick an element by its index, as in (//*/div)[1]. This seems to have been introduced by the fix for #42, which shipped in version 1.2.0 of the xpath library that htmlquery uses. Here's a sample piece of code where it fails.

package main

import (
    "fmt"
    "strings"

    "github.com/antchfx/htmlquery"
    "golang.org/x/net/html"
)

func main() {
    fmt.Printf("Run 1: %#v\n", getDiv(`(//*/div)`)) // works
    fmt.Printf("Run 2: %#v\n", getDiv(`(//*/div)`)) // fails: returns nil

    fmt.Printf("Run 1: %#v\n", getDiv(`//*/div`)) // works
    fmt.Printf("Run 2: %#v\n", getDiv(`//*/div`)) // also works
}

func getDiv(selector string) *html.Node {
    s := `<html><head></head><body><div>a</div></body>`
    doc, err := htmlquery.Parse(strings.NewReader(s))
    if err != nil {
        panic(err)
    }
    return htmlquery.FindOne(doc, selector)
}

zhengchun commented 1 year ago

Thanks for the report. Updating your xpath package to v1.2.2 fixes this.

https://github.com/antchfx/xpath/releases/tag/v1.2.2

JWAlberty commented 1 year ago

@zhengchun Thank you. I just tested v1.2.2 in my project and can confirm it fixes the issue.