Open 1-bytes opened 2 years ago
Do I need to create a storage and queue for each collector?
I'm not sure whether this is the right approach :(
Example:
package main

import (
	"log"

	"github.com/gocolly/colly"
	"github.com/gocolly/colly/queue"
	"github.com/gocolly/redisstorage"
)

func main() {
	urls := []string{
		"http://httpbin.org/",
		"http://httpbin.org/ip",
		"http://httpbin.org/cookies/set?a=b&c=d",
		"http://httpbin.org/cookies",
	}
	urls2 := []string{
		"https://cmd5.org/",
		"https://cmd5.org/login.aspx",
	}

	c := colly.NewCollector()
	c2 := c.Clone()

	// create the redis storage
	storage := &redisstorage.Storage{
		Address:  "192.168.100.101:6379",
		Password: "",
		DB:       0,
		Prefix:   "httpbin_test",
	}
	storage2 := &redisstorage.Storage{
		Address:  "192.168.100.101:6379",
		Password: "",
		DB:       0,
		Prefix:   "cmd5.org",
	}

	// add storage to the collector
	if err := c.SetStorage(storage); err != nil {
		log.Fatal(err)
	}
	if err := c2.SetStorage(storage2); err != nil {
		log.Fatal(err)
	}

	// close redis client
	defer storage.Client.Close()
	defer storage2.Client.Close()

	// create a new request queue with redis storage backend
	q, err := queue.New(3, storage)
	if err != nil {
		log.Fatal(err)
	}
	q2, err := queue.New(4, storage2)
	if err != nil {
		log.Fatal(err)
	}

	c.OnResponse(func(r *colly.Response) {
		log.Println("[c]Cookies:", c.Cookies(r.Request.URL.String()))
	})
	c2.OnResponse(func(r *colly.Response) {
		log.Println("[c2]Cookies:", c2.Cookies(r.Request.URL.String()))
	})

	// add URLs to the queue
	for _, u := range urls {
		q.AddURL(u)
	}
	for _, u := range urls2 {
		q2.AddURL(u)
	}

	// consume requests (q2 starts only after q has finished)
	q.Run(c)
	q2.Run(c2)
}
Is there a more elegant way to achieve this? The approach above results in a lot of duplicated code.
After I clone a Collector, I'm not sure whether the clone should use the same storage and queue as the original or get its own.
I referenced http://go-colly.org/docs/examples/redis_backend/ and http://go-colly.org/docs/examples/coursera_courses/
regards