Closed zsusswein closed 1 month ago
From @jkislin:
Amit said it makes more sense to have a static name (i.e. cfa-epinow2-pool) and just refresh the pool each run. We can talk on this soon. I told him we're working on cleaning up the old pool with a job, which he suggested is ok... but that just having one name for the pool is probably standard/best practice.
I think this is option (1) above? But not totally sure.
My suggestion (following @zsusswein) is to check if the pool exists (one pool per branch). If the pool exists, use something like this to update the container:
az batch pool update \
--pool-id <your_pool_id> \
--virtual-machine-configuration image-reference.virtual-machine-image-id <new_container_image_url>
Cleaning the pool should happen either manually or once the branch is deleted.
az batch pool update
I like this idea, but I can't find such a command in the docs.
I think that my preferred method at this point is option 2. I would see if it currently exists with az batch pool list
, delete it if it does exist with az batch pool delete
, then re-create it with az batch pool create
.
I know this could take a while if there are nodes, but if it's running in CI, then I don't think it's that much of an issue. I know from personal experience that this process can be pretty quick (<10s) assuming there are no nodes currently in the pool.
You probably didn't find it, @natemcintosh, because AI is hallucinating xD. I'll see if there's anything real about it; if so, we can use it! If not, I agree with you that we need to use those three sets of commands. To avoid breaking things, I would add that we should include a label on the commit message to actively (continuously) delete the pool (of the sort of [skip ci]
).
54 sets up a naming scheme that tags images and pools with the PR name (if a PR) and
latest
(if main). This strategy fails if the pool already exists.There are a couple of options here:
cfa-nnh-pipelines
az batch pool show
). We need to make sure any nodes in the pool drain before recreating and this can take some time. (this is my preferred option but open to the previous)cc: @gvegayon