anaskhan96 / soup

Web Scraper in Go, similar to BeautifulSoup
MIT License
2.18k stars 168 forks source link

Debug mode, check if element is found and correct comments #10

Closed danilopolani closed 7 years ago

danilopolani commented 7 years ago

According to #7, with this merge you will be able to:

  1. Check if the element is found with the Error field in the Root struct;
  2. Toggle debug mode with the SetDebug() function. Default is false, if set to true will show the various panic().

Example to check if the node is found (no panic will appear in the terminal):

source := soup.HTMLParse(resp)
articles := source.Find("section", "class", "loop").FindAll("article")
for _, article := range articles {
    link := article.Find("h2").Find("a")
    if link.Error == nil { // link is an instance of Root
        fmt.Println(link.Text(), "| Link :", link.Attrs()["href"])
    }
}

Example to check if the node is found with debug mode (panic will appear in terminal):

soup.SetDebug(true)

source := soup.HTMLParse(resp)
articles := source.Find("section", "class", "loop").FindAll("article")
for _, article := range articles {
    link := article.Find("h2").Find("a")
    if link.Error == nil { // link is an instance of Root
        fmt.Println(link.Text(), "| Link :", link.Attrs()["href"])
    }
}

Notes:

anaskhan96 commented 7 years ago

Thank you for your PR!

SetDebug() function seems like a good addition to the package. Found as a field of Root however, kind of deviates from what we were discussing in #7 . Instead, an Error field returning an error (if occurred, nil otherwise), seems to lean more, if not completely, towards idiomatic Go. For example:

link := doc.Find("h2").Find("a")
if link.Error != nil {
    // handle the error
}

Node, Root, FindNextSibling, and FindPrevSibling do need proper comments for documentation which I'll be looking into.

danilopolani commented 7 years ago

Yes, I thought about Error which is more Goish, but Found is more logic-driven (element found or not) as boolean, anyway tomorrow I'll change it to the error type.

After this PR, I have an idea about the other open issue, let's see if it works!

danilopolani commented 7 years ago

Ok, I replaced Found with Error and I updated the PR description. Let me know!

anaskhan96 commented 7 years ago

Looks good. Merged!