When the pipeline agent pools are experiencing queuing issues, re-running failed jobs works well to push a pipeline through the queue, provided the pipeline has already made it past the generate matrix stage. A more reliable matrix generation job queue that is independent of the main job queue would help pipelines reach that point.
It's possible that using stateful agents would also help with the PowerShell JIT delays incurred by the matrix scripts, which could bring the job time down from 20-30s to about 1s.
When we have queue backlogs, the number of jobs in the queue can be misleading for auto-scaling purposes, because each generate matrix job that is still waiting appears as a single job in the queue, when it will actually generate 5+ more jobs dynamically once it runs. Those new jobs also get pushed all the way to the back of the queue, which breaks ordering for pipelines that should be running sooner.
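To illustrate the undercounting, here is a minimal sketch of how an auto-scaler could estimate the real demand behind the visible queue length. The `QueuedJob` type, the `is_matrix_generator` flag, and the default fan-out of 5 are assumptions made for illustration (the fan-out is taken from the "5+" figure above). None of this is an existing Azure DevOps API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class QueuedJob:
    name: str
    # Hypothetical flag: True for a generate matrix job that has not run yet.
    is_matrix_generator: bool


def effective_queue_depth(jobs: list[QueuedJob], expected_fanout: int = 5) -> int:
    """Count each queued job, plus the jobs a pending generator is expected to spawn."""
    depth = 0
    for job in jobs:
        depth += 1  # the job that is visible in the queue today
        if job.is_matrix_generator:
            depth += expected_fanout  # jobs it will add once it runs
    return depth


queue = [
    QueuedJob("generate_matrix_a", True),
    QueuedJob("generate_matrix_b", True),
    QueuedJob("unit_tests", False),
]
print(len(queue), effective_queue_depth(queue))  # prints: 3 13
```

In this example the queue shows 3 jobs, but the two pending generators are expected to add 10 more, so scaling on the raw count would underestimate demand by roughly a factor of four.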
The generate matrix job takes very little time to run, so I think the pool could be very small, especially if we do something like run 100 DevOps worker agents on a couple of VMs.
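As a rough sanity check on the pool size, the number of agents needed can be estimated from the job arrival rate and the job duration (Little's law). This is only a sketch: the arrival rate of 60 jobs per minute and the 50% target utilization are made-up inputs, and `agents_needed` is a hypothetical helper. Only the 30s and 1s job durations come from the figures above.

```python
import math


def agents_needed(jobs_per_minute: float, job_seconds: float,
                  target_utilization: float = 0.5) -> int:
    """Estimate how many agents keep up with the arrival rate, with some headroom."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Average number of agents busy at any moment (arrival rate x service time).
    busy_agents = jobs_per_minute / 60.0 * job_seconds
    # Add headroom so bursts do not immediately queue; always keep at least one agent.
    return max(1, math.ceil(busy_agents / target_utilization))


# Hypothetical load of 60 generate matrix jobs per minute.
print(agents_needed(60, 30))  # 30s jobs on cold agents -> 60
print(agents_needed(60, 1))   # 1s jobs if JIT delays are avoided -> 2
```

Under these assumed numbers the pool size is dominated by the job duration, so cutting the job time matters as much for sizing as it does for latency.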
In short, enabling this would address the queuing, startup delay, and job ordering problems described above.