As of today, workers can select which scrapers they want to run.
But at a central level, we have no control over which workers are allowed to run which scrapers, and this could be an issue for various reasons:
- we want the task to be executed close to the server / on "approved" machines (e.g. mwoffliner tasks on mwoffliner machines)
- the task cannot succeed unless it is executed on specific machines (typically due to IP whitelisting)
So, in addition to the platforms limitation, we probably also need to:
- control centrally which workers can run which scrapers (e.g. mwoffliner should run only on mwoffliner workers), maybe at the recipe level, or with specific tags
- control centrally the assignment of some recipes to some worker(s) (e.g. for Zimit recipes of a Cloudflare-protected website, we want the recipe to run on a single worker whose IP has been whitelisted)
It seems pretty important to implement this ticket for Wikimedia recipes (not all mwoffliner recipes), because only the mwoffliner workers have privileged access to the Wikimedia API (no quota).
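
To make the two controls above more concrete, here is a rough sketch of what a central eligibility check could look like. It is only an illustration under assumptions: the `Worker` and `Recipe` classes, the `required_tags` and `allowed_workers` fields, and the `can_run` function are all hypothetical names, not existing code.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Worker:
    name: str
    # Scrapers the worker itself has opted into (the current behaviour).
    selected_scrapers: Set[str]
    # Hypothetical: tags assigned centrally, e.g. {"wikimedia"}.
    tags: Set[str] = field(default_factory=set)


@dataclass
class Recipe:
    name: str
    scraper: str
    # Hypothetical: tags a worker must carry to run this recipe.
    required_tags: Set[str] = field(default_factory=set)
    # Hypothetical: if set, only these named workers may run the recipe.
    allowed_workers: Optional[Set[str]] = None


def can_run(worker: Worker, recipe: Recipe) -> bool:
    """Return True if the worker is allowed to run the recipe."""
    # The worker's own scraper selection still applies.
    if recipe.scraper not in worker.selected_scrapers:
        return False
    # Central control by tag: the worker must carry every required tag.
    if not recipe.required_tags.issubset(worker.tags):
        return False
    # Central control by explicit assignment to named worker(s).
    if recipe.allowed_workers is not None and worker.name not in recipe.allowed_workers:
        return False
    return True
```

With something like this, a Wikimedia recipe could set `required_tags={"wikimedia"}` so that only tagged mwoffliner workers pick it up, while other mwoffliner recipes stay unrestricted; a Zimit recipe for a Cloudflare-protected site could set `allowed_workers` to the single whitelisted worker.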