internetarchive / Zeno

State-of-the-art web crawler 🔱
GNU Affero General Public License v3.0

Queue all items from seeds list before starting to crawl #121

Closed CorentinB closed 3 months ago

CorentinB commented 3 months ago

When using `get list`, we should queue every item from the seeds list BEFORE starting the crawl. This would keep handover from blocking the feeding of the queue, and it would also improve how the seeds are distributed across the workers.
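To illustrate the ordering proposed here, below is a minimal sketch in Go. It is not Zeno's actual code: `crawlAll`, the buffered channel used as the queue, and the per-worker counters are all hypothetical names chosen for the example. The only point it shows is that every seed is enqueued before the first worker starts, so workers pull from a queue that is already full.

```go
package main

import (
	"fmt"
	"sync"
)

// crawlAll is a hypothetical helper, not part of Zeno.
// It enqueues every seed first, then starts the workers, and returns
// how many seeds each worker processed.
func crawlAll(seeds []string, workers int) []int {
	// The buffer holds the whole seed list, so enqueueing never blocks
	// even though no worker is running yet.
	queue := make(chan string, len(seeds))
	for _, s := range seeds {
		queue <- s
	}
	close(queue) // no more seeds will be added

	// Only now are the workers started; each one drains the shared queue.
	counts := make([]int, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for range queue {
				counts[id]++ // placeholder for actually crawling the seed
			}
		}(w)
	}
	wg.Wait()
	return counts
}

func main() {
	seeds := []string{
		"https://example.com/a",
		"https://example.com/b",
		"https://example.com/c",
	}
	total := 0
	for _, c := range crawlAll(seeds, 2) {
		total += c
	}
	fmt.Println("processed:", total)
}
```

In this sketch the split of seeds between workers depends on goroutine scheduling, but every seed is processed exactly once, and no worker waits on a feeder that is itself blocked.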