gocolly / colly

Elegant Scraper and Crawler Framework for Golang
https://go-colly.org/
Apache License 2.0
23.21k stars 1.76k forks

Utilize Cache Headers #226

Open ColtonProvias opened 6 years ago

ColtonProvias commented 6 years ago

Right now, as far as I can see, the caching solution has no TTL. To avoid over-caching and to respect server-defined cache headers, why not make use of https://github.com/gregjones/httpcache?

RC1140 commented 5 years ago

@asciimoo I need this now, so I'd like to start working on a PR. Do you have any objection to me pulling in the HTTP cache library mentioned in the original request?

I did a few quick tests locally, and it looks like it should do the job. It should also make colly a bit more flexible, since you can swap in any cache backend that implements the interface the library provides.
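For reference, a rough sketch of what a pluggable backend could look like. The `Cache` interface below is redeclared locally so the example compiles without the dependency; it is meant to mirror the three-method storage interface that httpcache documents, so please check it against the library before relying on it. `mapCache` and `newMapCache` are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache mirrors the storage interface expected by httpcache (assumption:
// redeclared here from the library's documentation, not imported).
type Cache interface {
	Get(key string) (responseBytes []byte, ok bool)
	Set(key string, responseBytes []byte)
	Delete(key string)
}

// mapCache is a hypothetical in-memory backend guarded by a mutex.
type mapCache struct {
	mu    sync.RWMutex
	items map[string][]byte
}

func newMapCache() *mapCache {
	return &mapCache{items: make(map[string][]byte)}
}

// Get returns the stored bytes for key, if any.
func (c *mapCache) Get(key string) ([]byte, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	b, ok := c.items[key]
	return b, ok
}

// Set stores the serialized response under key, replacing any old entry.
func (c *mapCache) Set(key string, responseBytes []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = responseBytes
}

// Delete removes key from the cache; deleting a missing key is a no-op.
func (c *mapCache) Delete(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.items, key)
}

// Compile-time check that mapCache satisfies the interface.
var _ Cache = (*mapCache)(nil)

func main() {
	var c Cache = newMapCache()
	c.Set("https://example.com/", []byte("cached response"))
	body, ok := c.Get("https://example.com/")
	fmt.Println(string(body), ok)
	c.Delete("https://example.com/")
	_, ok = c.Get("https://example.com/")
	fmt.Println(ok)
}
```

With the real library, a backend like this would presumably be wrapped in an httpcache transport (for example via `httpcache.NewTransport`) and handed to the collector with `WithTransport`, so that freshness decisions come from the response headers rather than from colly's own cache directory. The exact wiring would need to be confirmed in the PR.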