NWChemEx / .github

GitHub Settings for the NWChemEx Organization
https://nwchemex.github.io/.github/
Apache License 2.0
1 stars 2 forks source link

CI/CD Use Cases/Considerations #105

Open ryanmrichard opened 1 year ago

ryanmrichard commented 1 year ago

Managing the CI for NWChemEx is a tall order. To afford the maintainers the largest amount of creativity possible I propose we put together a list of use cases/considerations. Any solution which addresses these use cases is fine.

Architecture

PR CI (think merging into a dev branch):

Release Deployment (think merging dev to master):

ryanmrichard commented 1 year ago

@hjjvandam, @twindus do you recall anything else form our CI meetings I missed?

ryanmrichard commented 1 year ago

@zachcran you've been thinking about CI/CD for the CMakePP org, so I'll tag you too.

hjjvandam commented 1 year ago

This looks like a complete list to me.

twindus commented 1 year ago

One piece that I am not sure is represented here is pulling different branches/tagged version of the dependent software. I think it is in the "Test multiple versions of dependencies", but the sub-bullet seems to imply something different.

For the dev documentation builds, do they get deployed somewhere so that the web versions can be checked? I know that can be done locally, but don't know exactly how that would/should be done in CI.

Another question is related to this line: "A failed dev to master pipeline should revert dev and reopen the original PR with the new issues". I am assuming that this means that a merge into dev would trigger trying to merge to master? Or is the merge to master done "by hand"? I believe it has to be the latter. Here I am thinking of the case where you have multiple repos that need to be changed and there will be PRs in different repos. You have to kick them off in some order. Merging to dev in each repo shouldn't be an issue (assuming everything else is OK). But I don't really know how the merge to master will work. If you do the integration tests when you do the first merge to master in one of the repos, this is probably going to fail due to problems associated with not merging in all of the dev branches at once. I am probably missing something here.

Those were the only things that came to mind.

ryanmrichard commented 1 year ago

One piece that I am not sure is represented here is pulling different branches/tagged version of the dependent software. I think it is in the "Test multiple versions of dependencies", but the sub-bullet seems to imply something different.

I added "Want to pin/follow specific versions." to the architecture part.

For the dev documentation builds, do they get deployed somewhere so that the web versions can be checked? I know that can be done locally, but don't know exactly how that would/should be done in CI.

I think in CI the best you get is ensuring the HTML is generated. If the generated HTML contains an error I'm not sure how you'd find that without requiring a human to manually read the documentation (which they should have done in the PR, although it admittedly may render differently). I have the "Releases should trigger deploying documentation" bullet for deploying, but that refers to the production deploymen.

Another question is related to this line: "A failed dev to master pipeline should revert dev and reopen the original PR with the new issues". I am assuming that this means that a merge into dev would trigger trying to merge to master? Or is the merge to master done "by hand"? I believe it has to be the latter. Here I am thinking of the case where you have multiple repos that need to be changed and there will be PRs in different repos. You have to kick them off in some order. Merging to dev in each repo shouldn't be an issue (assuming everything else is OK). But I don't really know how the merge to master will work. If you do the integration tests when you do the first merge to master in one of the repos, this is probably going to fail due to problems associated with not merging in all of the dev branches at once. I am probably missing something here.

I added "PRs may be dependent (changes needed in multiple repos)" to both the PR to dev and dev to master sections. I think we want automatic merging to master. My expectation is that once we're at 1.0 status it should be rare for a dev to master merge to fail. IIRC GitHub added PR queues (or something like that) for ensuring an order; I haven't looked into it too much. Otherwise we could have a flag to request manual merging or something when order matters.

yzhang-23 commented 1 year ago

@ryanmrichard I could be naive, but could you explain a little bit more on the dependency diamond issue? I'm not clear what problems it may cause in a CI process.

ryanmrichard commented 1 year ago

image

Is an example of a dependency diamond in our stack. Namely SimDE depends on ParallelZone through both PluginPlay and Chemist. CMake handles a lot of the problem for you in that CMake will makes sure that SimDE only sees one version of ParallelZone and that the version SimDE sees is compatible with PluginPlay, Chemist, and SimDE.

I don't know what happens when images enter into the equation, in particular is SimDEgoing to see two copies of ParallelZone (one from each base image)? Maybe it'll "all just work", but that's not clear to me.

yzhang-23 commented 1 year ago

image

Is an example of a dependency diamond in our stack. Namely SimDE depends on ParallelZone through both PluginPlay and Chemist. CMake handles a lot of the problem for you in that CMake will makes sure that SimDE only sees one version of ParallelZone and that the version SimDE sees is compatible with PluginPlay, Chemist, and SimDE.

I don't know what happens when images enter into the equation, in particular is SimDEgoing to see two copies of ParallelZone (one from each base image)? Maybe it'll "all just work", but that's not clear to me.

In my current design, the image to build a repo is built in the following steps:

  1. Pull the base building image, e. g., the one for building Parallele, as a start point;
  2. Add other packages which are necessary but not present in the base image;
  3. Pull the release images of dependent repos and copy everything in their /install directories to the current /install directory.

In your example, the building image of SimDe will pull the release images of PluginPlay and Chemist, which both have a library of ParallelZone under their /install directories. However, since the ParallelZone libraries are identical, I would expect that one just overwrite the other, and thus nothing we need to worry about (to be tested)

zachcran commented 1 year ago

However, since the ParallelZone libraries are identical, I would expect that one just overwrite the other, and thus nothing we need to worry about (to be tested)

I believe CMake handles this by just keeping the first one that is found and using it for all of them. There is also an extra safeguard built into CMaize install files that stops the imported target from being created if it already exists, so hopefully "it just works"(TM).

ryanmrichard commented 1 year ago

image Is this what you're doing?

yzhang-23 commented 1 year ago

image Is this what you're doing?

Yes, that's what I'm doing.