NOAA-GSL / ExascaleWorkflowSandbox

Other
2 stars 2 forks source link

Consolidate Dockerfiles for ExaWorks constituents #21

Closed christopherwharrop-noaa closed 1 year ago

christopherwharrop-noaa commented 1 year ago

This PR uses Docker build arguments to consolidate the Dockerfiles that differ only in their specification of the operating system in the FROM statement.

We have Dockerfiles for Ubuntu20 and CentOS7 builds of ExaWorks and its constituents. The Dockerfiles for the base environment are significantly different because of the OS package management details. However, the Dockerfiles for the constituent pieces only differ in the initial FROM statement at the top where either an Ubuntu or a CentOS operating system is chosen. This commit removes the duplicate Dockerfiles for the constituent parts and uses build arguments (e.g. ARG) to specify the OS to use at build time. The default OS is Ubuntu20. This eliminates 5 extraneous Dockerfiles and increases maintainability by removing duplication of code.

The CI workflows have been updated to use the appropriate build arguments when building the containers.

venitahagerty commented 1 year ago

I like how you've changed your files so you can consolidate your Dockerfiles. However, if you could set a variable/constant at the top so that you don't repeat the os name, then you'd only have to change it once at the top i.e. when ubuntu20 becomes ubuntu21.

christopherwharrop-noaa commented 1 year ago

@venitahagerty - I'm not 100% sure what you're referring to. Can you be more specific?

christopherwharrop-noaa commented 1 year ago

Are you asking for using a variable in the CI workflow yaml to avoid repetition of the OS?

env:
  OS: ubuntu20

and then:

 build-args: OS=${OS}

Is that what you mean? That would indeed simplify updates to the CI workflow definitions. I can do that.

venitahagerty commented 1 year ago

Yes, that's what I mean

christopherwharrop-noaa commented 1 year ago

Thank you for the suggestion. Much better now.

christopherwharrop-noaa commented 1 year ago

Interesting behavior with the tests. It looks like I may be thrashing the cache. The later container builds are being skipped because they haven't changed. The early ones are waiting to run even though they haven't changed either. My guess is that the cache is too small to hold all of the builds. So my thinking is that the ones which completed first last time got kicked out of cache, and now GHA can't retrieve them so it's redoing them. Bummer! Those take a LONG time (2-3 hours) to run.

christopherwharrop-noaa commented 1 year ago

@venitahagerty - My last commit to address your comments dismissed your approval. Now that the checks have passed can you please take another quick look to see if my last change still looks good?