Open don-han opened 8 years ago
Just for future reference, we could use the priority kwarg
to take advantage of the priority queue (PQ) that Scrapy has built in for requests. Here is what I posted on Slack:
[2:17]
...since higher priority values correspond to, well, higher priority, just
take the difference between max_depth and the depth of the current
page and pass that in as the priority. We take the difference because
we want higher priority to correspond to lower depth, effecting a BFS
by page depth. I don't remember if this is the case, but we'd have to
enqueue all domains first, so that it doesn't start BFS... on one
domain.
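
A rough sketch of the idea, for illustration only. `MAX_DEPTH`, `depth_priority`, and `crawl_order` are hypothetical names, and a plain `heapq` stands in for Scrapy's scheduler so the ordering can be checked without running a crawl:

```python
import heapq

MAX_DEPTH = 3  # hypothetical crawl limit, not a value from this thread


def depth_priority(depth, max_depth=MAX_DEPTH):
    """Return a larger value for shallower pages (higher priority is served first)."""
    return max_depth - depth


def crawl_order(pages):
    """Order (url, depth) pairs the way a highest-priority-first queue would.

    Stand-in for the real scheduler: heapq is a min-heap, so the priority is
    negated. The counter keeps insertion order among equal priorities.
    """
    heap = []
    for counter, (url, depth) in enumerate(pages):
        heapq.heappush(heap, (-depth_priority(depth), counter, url))
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]
```

In a spider, the same value would presumably be passed as `scrapy.Request(url, priority=depth_priority(depth))`, with `depth` tracked by us or read from `response.meta["depth"]` if Scrapy's depth middleware is enabled. The sketch only orders pages that are already queued, which is why all start domains would need to be enqueued up front.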
Suggested by @alvinwan:
priority http://doc.scrapy.org/en/latest/topics/request-response.html