gocolly / colly

Elegant Scraper and Crawler Framework for Golang
https://go-colly.org/
Apache License 2.0

Add OnFinally support #305

Open icamys opened 5 years ago

icamys commented 5 years ago

It would be nice to have an OnFinally handler that is executed after both the OnError and OnScraped handlers.

jredl-va commented 5 years ago

@icamys isn't OnScraped already your OnFinally handler? For reference, look at the order of things: https://github.com/vendasta/web-crawler/blob/786b1eb775942f5dd8432c6c3b6015b1e89a9d87/vendor/github.com/gocolly/colly/colly.go#L551

OnRequest is functionally equivalent to --> OnInit
OnScraped is functionally equivalent to --> OnFinally

icamys commented 5 years ago

@jredl-va My bad. I was returning a custom error in RedirectHandler, and this made colly stop working. After inspecting the sources, I found out that OnScraped is reachable after OnError. Thank you for your time!

icamys commented 5 years ago

@jredl-va I have a question though: when I get a "no such host" error, the OnScraped handlers are not called at all. Colly just stops execution here:

https://github.com/gocolly/colly/blob/master/colly.go#L612

That was the reason why I asked about OnFinally handler.

I think it would be convenient to be able to run some actions both after a successful scrape and after a request error.