Closed gvegayon closed 1 month ago
All modified and coverable lines are covered by tests :white_check_mark:
@zsusswein and @natemcintosh, this should fix the errors you are getting in other PRs (or reduce them!)
Will get to this today! Got caught up in production yesterday
@gvegayon this is tremendous. Thanks again! I don't have time to do a proper review until next week (I'd want to test and watch Azure Batch Explorer and Github Actions a few times, which I can do), but if all checks are passing and Zach and Nate think it looks good, I'll defer to them.
This is going to make future scheduled work with Azure Batch so much simpler!
Looking beyond the scope of this PR, maybe we could generalize some of these steps. It is quite likely that we'll need to do very similar things in the future for other repos, which would probably require a lot of very careful copy and paste. To reduce worry about getting something wrong in that process, maybe we could make a reusable GitHub Action for doing these steps? Something like
- name: Refresh pool
  uses: CDCGov/refresh-pool@v1
  with:
    arg1: ...
    arg2: ...
That said, I've never created an action before, and don't know what the process might look like.
> Looking beyond the scope of this PR, maybe we could generalize some of these steps.
I like this idea, but think it should be beyond the scope of this PR. Nate, would you mind making this an issue instead?
EDIT: #72
I also want to flag for @natemcintosh that we should think about how to handle the case of the pool running a stale version of the tagged image. As in:
feat-xyz
cfa-epinow2-pipeline:feat-xyz
cfa-epinow2-pipeline:feat-xyz
(which in the future will be linked to the image from (2) but is not yet, pending #59)
cfa-epinow2-pipeline:feat-xyz
to the new layers

I think this is unanswerable in our current setup (pending #59). So I'm mainly flagging this as a weak spot for us to keep an eye on.
Good question. Once things get merged, I'd say we should try it out by making some change that we can easily check the output of.
@zsusswein, here are some answers to your questions:
- How would this play with Clean up on PR close #62? I still want to leave that functionality for a separate PR but would it be as simple as adding an additional `or` clause to the pool delete condition? If it's harder, are there any changes that can be made here to make implementing it simpler?
I would add that as a separate workflow. I can deal with #62 after this is merged.
- How easy do you think it would be to add polling for pool deletion completion via `az batch pool show`? I think we'd want the runner to hang until deletion is successful so that the runner doesn't report success unless the pool is actually deleted. Feel free to shunt this to a new issue.
I think this could be done, but it sounds like a separate action for CDCgov/cfa-actions! Will add the issue there. Mostly because it will involve writing a program that loops checking whether the pool deletion was complete, and that could be something useful for others.
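For reference, the looping program described above could look roughly like the following. This is only a minimal sketch, not what CDCgov/cfa-actions ships: the names `wait_for_pool_deletion` and `az_pool_exists`, the timeout, and the polling interval are all hypothetical, and treating any non-zero exit from `az batch pool show` as "pool is gone" is a crude assumption (an auth failure would look the same).

```python
import subprocess
import time
from typing import Callable


def az_pool_exists(pool_id: str) -> bool:
    """Hypothetical helper: True if `az batch pool show` succeeds for pool_id.

    Assumption: a non-zero exit code means the pool no longer exists.
    """
    result = subprocess.run(
        ["az", "batch", "pool", "show", "--pool-id", pool_id],
        capture_output=True,
    )
    return result.returncode == 0


def wait_for_pool_deletion(
    pool_exists: Callable[[], bool],
    timeout_s: float = 600.0,   # assumed default, not from the PR
    interval_s: float = 15.0,   # assumed default, not from the PR
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until pool_exists() returns False or the timeout elapses.

    Returns True if the pool was observed to be gone, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if not pool_exists():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

In a workflow step this would be called as something like `wait_for_pool_deletion(lambda: az_pool_exists("feat-xyz"))`, failing the step when it returns False so the runner only reports success once the pool is actually deleted.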
- Would it be possible to have a bot leave a comment on the PR with the current state of the linked pool?
Not with the current run, but with the last one. It could be something similar to what's going on in https://github.com/CDCgov/cfa-actions/pull/1. I suggest creating a separate issue for this.
- Can you add a description of your new functionality and how to trigger pool deletion/recreation to the readme?
Sure. I'll add that to the readme (or a readme).
I was just adding a mermaid diagram, @zsusswein. Here it is:
flowchart LR
START((Start))---DEPS_CACHED
DEPS_CACHED{Deps<br>cached?}---|No|DEPS
DEPS_CACHED---|Yes|IMG
subgraph DEPS[Job01-build_image_dependencies]
direction TB
Dockerfile-dependencies---|Generates|DEPS_IMAGE[Dependencies<br>Image]
end
DEPS---IMG
subgraph IMG[_01_build-model-image]
direction TB
Dockerfile---|Generates|PKG_IMG[Package<br>Image]
end
IMG---POOL
subgraph POOL[_02_create-batch-pool-and-submit-jobs]
direction TB
POOL_EXISTS{Is the pool<br>up?}
POOL_EXISTS---|No|CREATE_POOL[Create the pool]
POOL_EXISTS---|Yes|DELETE_POOL{Commit includes<br>'delete pool'}
DELETE_POOL---END_POOL((End))
CREATE_POOL---END_POOL
end
If you like it, I can PR it
Love it! Please go for the PR.
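For anyone who prefers code to diagrams, the flowchart's decisions can be written out as a small function. This is just a sketch of the diagram, not the workflow itself: `plan_steps` is a hypothetical name, and it assumes that a commit containing 'delete pool' leads to the pool being deleted when it is up, which the diagram implies but does not spell out.

```python
def plan_steps(deps_cached: bool, pool_exists: bool, commit_msg: str) -> list[str]:
    """Return the steps the flowchart would walk through, in order."""
    steps = []
    # The dependencies image is only rebuilt when it is not cached.
    if not deps_cached:
        steps.append("Job01-build_image_dependencies")
    # The package image and the pool job always run.
    steps.append("_01_build-model-image")
    steps.append("_02_create-batch-pool-and-submit-jobs")
    # Inside the pool job: create the pool if it is missing; otherwise
    # (assumption) delete it when the commit message asks for it.
    if not pool_exists:
        steps.append("create pool")
    elif "delete pool" in commit_msg:
        steps.append("delete pool")
    return steps
```

For example, `plan_steps(True, False, "fix typo")` skips the dependencies job and ends with `"create pool"`.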
This pull request includes several enhancements and modifications to the GitHub Actions workflow file `.github/workflows/1_pre-Test-Model-Image-Build.yaml`. The changes focus on improving the build process, adding commit message handling, and enhancing the Azure Batch pool creation logic.

Azure Batch pool creation:
- Pools can be re-built via commit message by passing the tag `[re-build pool]`.
- Pools can be deleted via commit message by passing the tag `[delete pool]`. Will only work if a pool with the branch name was created.
- Tags `[re-build pool]` and `[delete pool]` can be combined.

Enhancements to build process:
- Added `name` attributes to various jobs for better identification in the workflow runs (`jobs.Job01-build_image_dependencies`, `jobs._01_build-model-image`, `jobs._02_create-batch-pool-and-submit-jobs`).

Commit message handling:
- Added a `commit-msg` output to capture the latest commit message and pass it between jobs.
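To illustrate how the two tags might be picked out of the `commit-msg` output, here is a minimal sketch. The workflow itself may well do this with a shell or expression check instead; `pool_tags` and `KNOWN_TAGS` are hypothetical names, and exact, case-sensitive matching is an assumption.

```python
import re

# The two tags named in this PR description.
KNOWN_TAGS = ("re-build pool", "delete pool")


def pool_tags(commit_msg: str) -> set[str]:
    """Return the known pool tags that appear in square brackets in commit_msg."""
    # Collect everything written as [ ... ] in the message.
    bracketed = re.findall(r"\[([^\[\]]+)\]", commit_msg)
    # Keep only the tags the workflow knows about (assumed: exact match).
    return {tag for tag in bracketed if tag in KNOWN_TAGS}
```

A commit message such as `Update config [re-build pool] [delete pool]` yields both tags, which matches the note that they can be combined; a message with no bracketed tag yields an empty set.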