Closed — ndushay closed this issue 1 year ago
resque-pool allows us to allocate a specific number of workers to specific queues. As I understand it, Sidekiq lets you allocate one pool of workers across all queues, with prioritization for which queue gets worked first when a worker becomes free (but no easy facility to reserve workers for specific queues the way resque-pool does). It'd be great if that changes in the new version of Sidekiq.
While the resque-pool approach does lead to less efficient utilization of allocated resources (resources are allocated for a peak that's much higher than the average at any moment), it also helps minimize situations like accessionWF/preservationIngestWF backing up because no worker threads are free to pick up validate-moab jobs, because they all happen to be occupied running checksum validation on ~1 TB media objects. That's an unusual situation, but not unheard of, and it'd be nice to avoid it. Similar worries about the replication pipeline backing up behind other work.
We've also arrived at pretty different maximums for the number of workers we want running at a time for some I/O-intensive tasks, both because of what seem to be the usual internet and storage bandwidth limits, and because we're happy with lower throughput and less resource competition for e.g. MoabToCatalog, compared to more urgent tasks like archive zip delivery or validation of versions in accessioning.
Happy to pair, or answer questions in the ticket, or on Slack.
ok, so a quick update after reading the link in the description and doing some web searching and documentation browsing.

i don't see anything at the link (or in the release notes on the 7 beta tag) indicating that worker allocation possibilities will change in the upcoming sidekiq version. but, after a bit of searching, i found:
https://github.com/mperham/sidekiq/issues/1960
Ah, that’s right. Thanks a lot.
I think the Advanced Options - Queues section could be improved to make it obvious that you can start two workers side by side, each with specific queues, e.g.:

sidekiq -c 8 -q default
sidekiq -c 4 -q critical
https://github.com/mperham/sidekiq/wiki/Advanced-Options#reserved-queues
Reserved Queues
If you'd like to "reserve" a queue so it only handles certain jobs, the easiest way is to run two sidekiq processes, each handling different queues:
sidekiq -q critical # Only handles jobs on the "critical" queue
sidekiq -q default -q low -q critical # Handles critical jobs only after checking for other jobs
So, it's possible that Sidekiq already supports what we need. Though we might have to do some capistrano and/or puppet work to pass through configuration? will we basically be writing dlss-sidekiq-pool?
i think the work for this ticket is something like: run preassembly_image_accessioning_spec.rb from the integration tests a bunch of times from a few different laptops simultaneously, queue some audits, maybe do some manual accessioning of large objects and objects with lots of files, and make sure things work.

i looked briefly for something like a sidekiq-pool gem, and i found: https://github.com/vinted/sidekiq-pool
but i haven't looked closely at it, and have no idea whether it does exactly what we want, whether it'd be easier than doing it ourselves, or how well maintained it is.
in case it's useful for concrete experimentation in this ticket or a follow-on, example of the switch in pre-assembly: https://github.com/sul-dlss/pre-assembly/pull/927
If my understanding is correct, Capsules will provide the necessary functionality: https://github.com/mperham/sidekiq/blob/main/docs/7.0-Upgrade.md#capsules
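For concreteness, here's a sketch of what a capsule-based setup might look like, based on the 7.0 upgrade doc linked above. The queue names and thread counts are made up for illustration, not our actual configuration:

```ruby
# config/initializers/sidekiq.rb -- hypothetical sketch using Sidekiq 7
# capsules; queue names and concurrency numbers are assumptions.
Sidekiq.configure_server do |config|
  # default capsule: 8 threads working the default and low queues
  config.queues = %w[default low]
  config.concurrency = 8

  # a separate capsule reserving 2 threads exclusively for one queue,
  # so big checksum-validation jobs can't starve it
  config.capsule("validate-moab") do |cap|
    cap.concurrency = 2
    cap.queues = %w[validate_moab]
  end
end
```

Each capsule gets its own dedicated thread pool within the process, which is roughly the "reserved workers" semantics we get from resque-pool today.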
The reserved queue approach is viable as well. However, it would require a bit of puppet work, as our current puppet sidekiq configuration doesn't provide a hook for this. A more viable alternative is to switch to docker for workers, where this could be easily accommodated in docker-compose.yml.
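To make the docker option concrete, a hedged sketch of what reserved queues might look like in a docker-compose.yml. The service names, image name, and queue names are assumptions for illustration:

```yaml
# hypothetical docker-compose.yml fragment: two worker services,
# each pinned to its own queues and concurrency
services:
  worker-default:
    image: our-app:latest            # assumed image name
    command: bundle exec sidekiq -c 8 -q default -q low
  worker-critical:
    image: our-app:latest
    command: bundle exec sidekiq -c 2 -q critical
```

Scaling a given queue's capacity up or down would then just be a matter of changing that service's `-c` value or replica count.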
oh nice, thanks @justinlittman! i hadn't found that release notes doc, the link from the post in the description was broken, and i must've missed it when browsing the repo.
this capsules feature does seem like a very promising approach, and lower effort than either the puppet work for reserved queues or dockerization (though i definitely wouldn't be opposed to dockerization, since it'd have other benefits).
On further consideration, capsules do not solve the problem. A capsule controls the threads assigned to a queue within a single worker process. However, we run multiple worker processes, so the number of threads is multiplied by the number of processes, which doesn't provide the necessary control.
An alternative approach is https://github.com/sul-dlss/operations-tasks/issues/3209, which allows a separate configuration file per worker. This should allow good-enough control using Sidekiq 6.
From maint/tech debt storytime discussion Wed 10/19:

As @jmartin-sul has observed, one hard thing is controlling the allocation of specific numbers of workers to specific queues. Resque-pool makes that easy, but also makes automatic retries hard (it needs a plugin). John has concerns about worker allocation; @justinlittman thinks the upcoming version of sidekiq may allow for this.

ACTIONS FOR TICKET:
- understand the worker allocation requirements, and document them
- investigate the upcoming/new sidekiq release to determine if we can meet these requirements

i believe these are all done.