webrecorder / browsertrix

Browsertrix is the hosted, high-fidelity, browser-based crawling service from Webrecorder designed to make web archiving easier and more accessible for all!
https://webrecorder.net/browsertrix
GNU Affero General Public License v3.0

[Feature]: Create job "channels" with separate and different numbers of harvester instances #2024

Open tuehlarsen opened 3 months ago

tuehlarsen commented 3 months ago

What change would you like to see?

We need different job "channels", each backed by its own pool of harvester instances, that we can attach our jobs to. This would prevent some daily deep jobs (100 GB) from hanging for a long time while they wait for capacity. That is what happens today if you schedule many jobs (> 10000 jobs) via the API with only 4 harvesters.
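To illustrate the waiting problem, here is a minimal toy simulation. It is not Browsertrix code: the `simulate` function, the job names, the durations, and the 3 + 1 split of harvesters are all hypothetical and only serve to show the effect. It compares one shared FIFO pool of 4 harvesters with two separate channels.

```python
import heapq


def simulate(jobs, workers):
    """Run jobs in FIFO order on `workers` identical harvesters.

    `jobs` is a list of (name, duration) tuples. Returns {name: start_time}.
    Hypothetical helper for illustration only.
    """
    free_at = [0.0] * workers  # time at which each harvester becomes free
    heapq.heapify(free_at)
    starts = {}
    for name, duration in jobs:
        start = heapq.heappop(free_at)  # earliest available harvester
        starts[name] = start
        heapq.heappush(free_at, start + duration)
    return starts


# Assumed workload: 1000 short jobs (1 time unit each) queued ahead of one deep job.
short_jobs = [(f"short-{i}", 1.0) for i in range(1000)]
deep_job = [("deep-daily", 50.0)]

# Today: a single shared pool of 4 harvesters.
shared = simulate(short_jobs + deep_job, workers=4)

# Requested: two channels, e.g. 3 harvesters for short jobs and 1 for deep jobs.
short_channel = simulate(short_jobs, workers=3)
deep_channel = simulate(deep_job, workers=1)

print("deep job start, shared pool:", shared["deep-daily"])
print("deep job start, own channel:", deep_channel["deep-daily"])
```

In this toy model the deep job only starts once every short job queued ahead of it has been picked up, whereas with its own channel it starts immediately. The trade-off is that the short jobs finish somewhat later because they have fewer harvesters.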

Context

See above.