[x] If there is already a relevant issue, whether open or closed, comment on the existing thread instead of posting a new issue.
[x] New features take time and effort to create, and they take even more effort to maintain. So if the purpose of the feature is to resolve a struggle you are encountering personally, please consider first posting a "trouble" or "other" issue so we can discuss your use case and search for existing solutions first.
I am working on a plugin for AWS Batch. Each API call to launch a Batch job takes about a second, and we need to get the job ID from the HTTP response. If launches are synchronous, that means it could take over 15 minutes to launch 1000 workers. And unfortunately, array jobs are never going to be a good fit for crew. I expect this sort of problem to be ubiquitous among cloud plugins.
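As a rough back-of-the-envelope check of the numbers above, here is a small sketch. The function name and the concurrency level of 20 are assumptions made for illustration; the one-second call time and the 1000 workers come from the description.

```python
import math


def launch_time_seconds(n_workers, seconds_per_call=1.0, concurrency=1):
    """Estimate wall time to submit n_workers jobs.

    `concurrency` is the number of API calls in flight at once
    (1 means fully synchronous launches).
    """
    batches = math.ceil(n_workers / concurrency)
    return batches * seconds_per_call


# Synchronous: 1000 calls at about one second each.
sync_minutes = launch_time_seconds(1000) / 60

# Hypothetical asynchronous case with 20 calls in flight at a time.
async_seconds = launch_time_seconds(1000, concurrency=20)

print(f"synchronous: {sync_minutes:.1f} min, asynchronous: {async_seconds:.0f} s")
```

The estimate ignores API rate limits and retries, which could dominate in practice.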
Launches should be asynchronous. Fortunately, this is a much smaller problem than the one crew is trying to solve in the first place, because the launch tasks have these properties:
Tasks are short, with roughly equal execution times.
Tasks all run locally.
Auto-scaling is not necessary.
This allows us to use mirai directly with local daemons and the passive dispatcher. The launcher can launch and terminate the local daemons using the start() and terminate() launcher methods (worker shutdowns would need to move to a shutdown() method). There can also be a condition variable that each daemon signals if there is a launch error, and launch() can check this condition variable for errors. I am not sure how much of this can be in base crew and how much needs to be in each specific plugin.
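To make the shape of the proposal concrete, here is a minimal, language-agnostic sketch in Python. It is not crew or mirai code: a thread pool stands in for the local daemons, a `threading.Event` stands in for the condition variable, and `submit_job()`, `AsyncLauncher`, and `job_id()` are hypothetical names invented for this illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def submit_job(name):
    """Hypothetical stand-in for a slow cloud API call that returns a job ID."""
    if name.startswith("bad"):
        raise RuntimeError(f"could not launch {name}")
    time.sleep(0.01)  # the real call is described as taking about a second
    return f"job-{name}"


class AsyncLauncher:
    def __init__(self, n_local=4):
        self.n_local = n_local          # fixed pool size: no auto-scaling
        self.pool = None
        self.error = threading.Event()  # plays the role of the condition variable
        self.handles = {}

    def start(self):
        """Start the local pool that performs launches in the background."""
        self.pool = ThreadPoolExecutor(max_workers=self.n_local)

    def _submit(self, name):
        try:
            return submit_job(name)
        except Exception:
            self.error.set()  # signal the launch error to the main process
            raise

    def launch(self, name):
        """Queue a launch without blocking; fail fast if an earlier one failed."""
        if self.error.is_set():
            raise RuntimeError("an earlier launch failed")
        self.handles[name] = self.pool.submit(self._submit, name)

    def job_id(self, name):
        """Block until the launch of `name` finishes and return its job ID."""
        return self.handles[name].result()

    def terminate(self):
        """Stop the local pool (not the remote workers)."""
        self.pool.shutdown(wait=True)
        self.pool = None


launcher = AsyncLauncher(n_local=4)
launcher.start()
for i in range(8):
    launcher.launch(f"worker{i}")  # returns immediately
ids = [launcher.job_id(f"worker{i}") for i in range(8)]
launcher.terminate()
```

In this sketch, terminate() only stops the local pool, which mirrors the idea that shutting down the remote workers would live in a separate method.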