antchfx / htmlquery

htmlquery is a Go XPath package for querying HTML.
https://github.com/antchfx/xpath
MIT License
723 stars 73 forks

substring-after() is not being executed #35

Closed plamen-nikolov closed 5 months ago

plamen-nikolov commented 3 years ago

I tried a few expressions with substring-after() and it seems the function is not being executed at all. I tried debugging func.go: substringIndFunc is called and returns a callable, but that callable is never invoked.

Example expression: substring-after(//span[@class="pageNumbersInfo"]//text(), "of ")

Node: Pages 1 of 25
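For reference, the result expected from this expression can be sketched in plain Go. This is only an illustration of the XPath 1.0 substring-after() semantics, not the library's implementation: the helper name substringAfter is hypothetical, and the sketch assumes the first matching text node is exactly "Pages 1 of 25".

```go
package main

import (
	"fmt"
	"strings"
)

// substringAfter is a hypothetical helper that mirrors XPath 1.0
// substring-after(): it returns the part of s that follows the first
// occurrence of sep, or "" when sep does not occur in s.
func substringAfter(s, sep string) string {
	i := strings.Index(s, sep)
	if i < 0 {
		return ""
	}
	return s[i+len(sep):]
}

func main() {
	// Assumed node text from the report above.
	fmt.Println(substringAfter("Pages 1 of 25", "of ")) // prints "25"
	// No match yields the empty string, as in XPath.
	fmt.Println(substringAfter("Pages 1 of 25", "from ") == "") // prints "true"
}
```

So a string result of "25" is what the expression should produce once it is evaluated as a string rather than as a node query.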

zhengchun commented 3 years ago

You need to provide more information to help me debug this.

c4tz commented 2 years ago

Hi, I just came across this problem, too.

Here is a minimal (not) working example:

package main

import (
    "log"
    "strings"

    "github.com/antchfx/htmlquery"
)

func main() {
    s := `<html>
    <head></head>
    <body>
        <div id="foo-12345"></div>
    </body>
</html>`
    xpath := "substring-after(//div/@id, 'foo-')"
    doc, err := htmlquery.Parse(strings.NewReader(s))
    if err != nil {
        log.Fatal(err)
    }
    results, err := htmlquery.QueryAll(doc, xpath)
    if err != nil {
        log.Fatal(err)
    }
    for _, r := range results {
        log.Print(r)
    }
}

When using the exact same XPath expression (and HTML) in the chromium dev tools, I get 12345 back as a result, but htmlquery does not seem to find anything.

zhengchun commented 2 years ago

@c4tz, substring-after() is a function, and it returns a string value, not a node. Your expression substring-after(//div/@id, 'foo-') tells the package to execute this function and return a string, whereas htmlquery.QueryAll only returns nodes.

Compare the following two examples:

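// Requires the "fmt" and "github.com/antchfx/xpath" imports in addition to htmlquery.
expr := xpath.MustCompile("substring-after(//div/@id, 'foo-')")
v := expr.Evaluate(htmlquery.CreateXPathNavigator(doc))
// Evaluate returns an interface{}; for a string function the dynamic type is string.
if s, ok := v.(string); ok {
    fmt.Println(s) // output: 12345
}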

The code below returns Node values.

    results, err := htmlquery.QueryAll(doc, "//div[substring-after(@id, 'foo-')]")
    if err != nil {
        log.Fatal(err)
    }
    for _, r := range results {
        log.Print(r)
    }

c4tz commented 2 years ago

Ahh :bulb: Thank you for the clarification! Then it just was a misunderstanding on my side, sorry.