AtChem / AtChem2

Atmospheric chemistry box-model for the MCM
MIT License
59 stars 23 forks source link

Support Running AtChem2 in a docker container #518

Closed willdrysdale closed 2 months ago

willdrysdale commented 2 months ago

Hi!

@bsn502 and I have been working on getting AtChem2 running in a Docker (and Singularity) container to make it easier to get up and running with the model. If you are interested, this is something that we could provide to the main repo.

The container comes with the cvode and openlibm dependencies installed (along with gcc-fortran, python3.11 and cmake), and should be ready to run out of the box wherever you run your containers. We have tried it out locally and on UoY's HPC successfully. I've updated the repo's README.md with some information on how to get it running.

Right now the image is built off of release v1.2.2 but I imagine with GitHub actions an image per release + a latest based on the current state of the master branch could be provided.

I've used GHCR to store the image, and as such the docs all relate the the version over on wacl-york/AtChem2 so it can be tested out. We can update this down the line.

Let us know any thoughts and if you'd be happy to include this in the repo

Cheers,

Will

rs028 commented 2 months ago

HI @willdrysdale @bsn502, this looks like a good idea. I confess I am not very familiar with Docker (beyond the general concepts) so I'll have to read about it a bit more before leaving my comments here.

Is the idea to have the user build their own image using the docker file provided or to create pre-built images that one can simply download?

willdrysdale commented 2 months ago

I see it that you would create a pre-built image that others download. It would mean that the user's workflow looks something like:

  1. run docker pull ghcr.io/atchem2/atchem2:1.2.2 (first time setup only)

  2. Setup their constraints in a folder that mimics the atchem2 structure with the inputs they wish to change e.g

    my_model_run
    ├── mcm
    │   └── my_mech.fac
    └── model
    ├── configuration
    ├── constraints
  3. Run the container with docker run -it --rm -v /path/to/my_model_run:/inout ghcr.io/atchem2/atchem2:1.2.2 my_mech.fac The -v flag being the important step where we transfer in those user specified changes into the container. These are then copied to the atchem folder via the entrypoint.sh script - alongside the mechanism to use being passed as a command line argument.

  4. Find their outputs in my_model_run/output

This way you are using dockers benefit of preventing "it works on my machine" issues as much as possible. If the user rebuild's the image there is a chance they change something or install a different version of a dependency. (not to mention the ease of installation).

With some input who knows a bit more about Linux package repos than me, I suspect the Dockerfile could specify versions of the dependencies to make this even more robust.

It is possible using Github actions to have these containers generated automatically like the CI runs, (I don't have much experience setting these up however) - but it should be the case that the image could appear alongside the releases, so one could either install the traditional way from the .tar or go the image and use that (like this on the WACL fork at the moment)

Keen to hear your thoughts when you've had a read :) - the image I've linked about should be working if you want to give it a test.

rs028 commented 2 months ago

I understand how this works, and I can see there are several docker related apps in the github marketplace so I think it should be possible to automatically generate the image, once a new release is created (I probably wouldn't do it for the current master branch, though, as it can be in a state of flux at times). But I suppose one can also generate the image and uploaded it manually, like you have done on the wacl repo.

The documentation should probably go to the manual rather than the README file, but you can leave it there for the time being as I am (very slowly) updating the manual anyway so it is easier if I do it myself.

Any reason the fac file needs to be in the mcm/ directory? At the moment it is recommended to keep the mechanism file in the model/ directory together with all the other settings of a specific user run.

Also, is the inout/ directory only an internal Docker thing?

rs028 commented 2 months ago

You should also add copyright notices to all the new files:

! -----------------------------------------------------------------------------
!
! Copyright (c) 2009 - 2012 Chris Martin, Kasia Boronska, Jenny Young,
! Peter Jimack, Mike Pilling
!
! Copyright (c) 2017 Sam Cox, Roberto Sommariva
!
! Copyright (c) 2024 Will Drysdale, Beth Nelson
!
! This file is part of the AtChem2 software package.
!
! This file is covered by the MIT license which can be found in the file
! LICENSE.md at the top level of the AtChem2 distribution.
!
! -----------------------------------------------------------------------------
willdrysdale commented 2 months ago
docker run -it --rm -v /path/to/my_model_run:/inout ghcr.io/atchem2/atchem2:1.2.2 ./mcm/my_mech.fac

but could equally be:

docker run -it --rm -v /path/to/my_model_run:/inout ghcr.io/atchem2/atchem2:1.2.2 ./model/my_mech.fac

I'll add a commit for this shortly.

rs028 commented 2 months ago

Yeah all right. If you can add some more comments to the shell scripts to explain better what they do, it would be great (see the installation scripts for examples).

I have started a review with some comments and questions, you should be able to reply directly under each one.

willdrysdale commented 2 months ago

Sounds good - where should I find these review questions?

rs028 commented 2 months ago

In the "Files Changed" tab of this PR :)

willdrysdale commented 2 months ago

Sorry - There are no comments showing up for me there, and I am seeing this: image on the PR?

rs028 commented 2 months ago

Try now (my mistake!)

willdrysdale commented 2 months ago

I've finished changing things for the moment, comments have been added to the files and I've streamlined the installation script a fair bit. Now there are no hardcoded versions, as it copies the state of the branch into the container.

The version can be included at build time, for example when I pushed to GHCR I used these:

docker build -t ghcr.io/wacl-york/atchem2:1.2.2 .

docker push ghcr.io/wacl-york/atchem2:1.2.2

You could then update the tag at the end with the release version

rs028 commented 2 months ago

So let me see if I get it.

Once all this is done, the commands : docker build and docker push will execute install.sh and create the image, which is then uploaded manually (for now).

A user would do docker pull to retrieve the image, and then docker run to execute entrypoint.sh and run the model, by mounting a local directory into the image under the name data_transfer (it doesn't matter what it is called on the local machine). Is this right?

willdrysdale commented 2 months ago

That's essentially it!

codecov[bot] commented 2 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 52.05%. Comparing base (c4ed6af) to head (6631d09).

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #518 +/- ## ======================================= Coverage 52.05% 52.05% ======================================= Files 17 17 Lines 2096 2096 Branches 166 166 ======================================= Hits 1091 1091 Misses 933 933 Partials 72 72 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

rs028 commented 2 months ago

It seems the CI action is not triggered correctly. Could you modify line 66 of .github/workflows/ci.yml to be checkout@v3?

willdrysdale commented 2 months ago

It seems the CI action is not triggered correctly. Could you modify line 66 of .github/workflows/ci.yml to be checkout@v3?

Can you confirm - line 66 currently reads:

      - uses: actions/checkout@v3

Do you want it changed to:

      - uses: checkout@v3

?

rs028 commented 2 months ago

Apologies, my mistake. Change to :

- uses: actions/checkout@v4

rs028 commented 2 months ago

I think the issue is the version of macos11 which is no longer supported. Can you change line 48 to os: [ubuntu-22.04, macos-12] and uncomment lines 51 and 52, please?

willdrysdale commented 2 months ago

I'm getting some slightly angry linter messages when I uncomment those lines image

Any thoughts?

rs028 commented 2 months ago

Ah, probably should also uncomment line 50 :)

willdrysdale commented 2 months ago

Ahh yes, I just read that as a normal comment for some reason!

willdrysdale commented 2 months ago

Fantastic - more than happy to help when it comes to the manual update.

@bsn502 has some models running now on the York HPC using the container - lets allow them to complete as a final test, should be done by tomorrow.

So long as they work I think we are good to merge. Then do you want to look at storing the image on this repo (manually for now)?

willdrysdale commented 2 months ago

Hi @rs028! We just made one last change to make sure the model outputs are copied out of the the container consistently, even if the model is being re-run.

I think we are happy to merge when you are :)

rs028 commented 2 months ago

Great. I think for the moment we can keep the image on the wacl page, if that is not a problem, while I figure out how to do it here.

willdrysdale commented 2 months ago

Sounds good to me - All the info in the readme currently points there, so everything should work for now - we can update that when the package is added here.