Closed Ziinc closed 3 years ago
OK. For (1), will refactor to utilize handle_continue.
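For reference, a minimal sketch of what deferring the work to handle_continue could look like. The module name `ManagerSketch`, the `:stored` call, and the state shape are hypothetical and only for illustration; they are not the actual Manager code in this PR.

```elixir
defmodule ManagerSketch do
  use GenServer

  # Hypothetical entry point; takes the list of start urls.
  def start_link(start_urls) do
    GenServer.start_link(__MODULE__, start_urls)
  end

  @impl true
  def init(start_urls) do
    # Return immediately so the process calling start_link/1 is not blocked
    # (and cannot time out) while the urls are being stored.
    {:ok, %{stored: 0}, {:continue, {:store, start_urls}}}
  end

  @impl true
  def handle_continue({:store, urls}, state) do
    # Placeholder for the real storage call; here we only count the urls.
    {:noreply, %{state | stored: length(urls)}}
  end

  @impl true
  def handle_call(:stored, _from, state) do
    {:reply, state.stored, state}
  end
end
```

Since handle_continue/2 runs right after init/1 and before any other message is processed, a later `GenServer.call(pid, :stored)` sees the state after the deferred work has finished.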
For (3), since the start urls/requests are one-off insertions, I think placing them in the manager is fine. We could add async bulk request insertion to the RequestStorage, if that is what you mean.
Will open up separate issues for (2) once this is merged.
This is a bugfix for the Manager crashing due to a timeout in the init callback, especially when there is a high number of start requests/urls. This PR implements a split strategy for storing urls/requests using both sync and async methods: the first 1000 requests are stored synchronously, and a linked task is fired off to add the remaining requests.
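A rough sketch of the split strategy described above, not the actual diff. `SplitStoreSketch`, `store_all/2`, and the `store_fun` argument are hypothetical names standing in for the real storage call; only the 1000-request threshold and the linked task come from the description.

```elixir
defmodule SplitStoreSketch do
  # Number of requests stored synchronously, per the PR description.
  @sync_limit 1000

  # `store_fun` is a hypothetical 1-arity function that stores one request.
  def store_all(requests, store_fun) do
    {first, rest} = Enum.split(requests, @sync_limit)

    # Sync part: the first batch is stored before this function returns.
    Enum.each(first, store_fun)

    # Async part: a linked task stores the remainder in the background.
    {:ok, task_pid} = Task.start_link(fn -> Enum.each(rest, store_fun) end)

    {length(first), task_pid}
  end
end
```

Because the task is linked, a crash while storing the remaining requests also brings down the calling process, so the failure is not silently lost.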