kmike closed this issue 6 years ago
PR https://github.com/scrapinghub/frontera/pull/331 is going to introduce a major change: backends will no longer be responsible for prioritisation, and that responsibility will be transferred to the crawling strategy (CS). The CS will have to schedule each request with a score ranging from 0.0 to 1.0; the higher the score, the higher the priority of the request. Currently, probably only HBaseBackend and RedisBackend support this.
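To illustrate the idea of score-based scheduling, here is a minimal sketch of a queue that hands out higher-scored requests first. It is not Frontera's API: the class `ScoredQueue` and its methods `schedule` and `next_batch` are hypothetical names made up for this example, and the FIFO tie-break among equal scores is an assumption, not documented backend behaviour.

```python
import heapq
import itertools


class ScoredQueue:
    """Hypothetical stand-in for a backend queue; NOT Frontera's API."""

    def __init__(self):
        self._heap = []
        # Monotonic counter used as a tie-breaker, so requests with equal
        # scores come out in insertion (FIFO) order. This is an assumption.
        self._counter = itertools.count()

    def schedule(self, url, score):
        """Called by the (hypothetical) crawling strategy with a score in [0.0, 1.0]."""
        if not 0.0 <= score <= 1.0:
            raise ValueError("score must be within [0.0, 1.0]")
        # heapq is a min-heap, so the score is negated to pop the highest first.
        heapq.heappush(self._heap, (-score, next(self._counter), url))

    def next_batch(self, max_n):
        """Return up to max_n URLs, highest score first."""
        batch = []
        while self._heap and len(batch) < max_n:
            _, _, url = heapq.heappop(self._heap)
            batch.append(url)
        return batch


queue = ScoredQueue()
queue.schedule("http://example.com/a", 0.2)
queue.schedule("http://example.com/b", 0.9)
queue.schedule("http://example.com/c", 0.9)
print(queue.next_batch(2))  # ['http://example.com/b', 'http://example.com/c']
```

In this sketch the queue only orders requests; deciding what score a request deserves is left entirely to the caller, which mirrors the split of responsibilities described above.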
Sounds great! I'm closing this issue then.
The backends documentation explains how prioritization works for
For the revisiting backend it says "no prioritization" - what does that mean? Are seeds scheduled in FIFO order? Are requests scheduled for recrawling processed in FIFO order as well?
For HBaseBackend, prioritization is not explained at all - is it FIFO or something similar? Do partitions affect it somehow?