geoschem / GCHP

The "superproject" wrapper repository for GCHP, the high-performance instance of the GEOS-Chem chemical-transport model.
https://gchp.readthedocs.io
Other
23 stars 26 forks source link

[DISCUSSION] How to submit a pull request with submodules #231

Closed lizziel closed 2 years ago

lizziel commented 2 years ago

Discussed in https://github.com/geoschem/GCHP/discussions/141

Originally posted by **TerribleNews** December 7, 2020 I believe the GCHP adjoint code is close to a state where I can submit a pull request. However, I have had to fork the geos-chem, MAPL, FVdycore, and HEMCO submodule repositories and make changes to them in addition to the GCHP code. Is that multiple pull requests then? How can they be coordinated? More generally, is there a guide about coding and testing requirements before submitting the request?
lizziel commented 2 years ago

We are now using GitHub Issues rather than GitHub Discussions.

lizziel commented 2 years ago

History of discussion (since I will delete that discussion to prevent further use of it):

@yantosca:

Hi @TerribleNews, thanks for writing. I've transferred the issue to the GCHP repo since this is more relevant to GCHP. My sense is that if you have new code for FvDycore & MAPL, then you should submit pull requests to GMAO, so that they can ingest those updates. Ditto for HEMCO. @LiamBindle, @msulprizio: any thoughts?

@LiamBindle:

So far what Lizzie and I have done for PRs to GMAO that affect multiple repos is create a "main PR" and link to the associated PRs. This procedure is far from seamless though—I'd be interested if anyone has ideas for improving it.

In this case, I would make 4 PRs. I would make the "main PR" the one to geoschem/geos-chem and link to the other three there. The "main PR" would coordinate the PRs, and it would describe how the reviewer should merge them.

The tricky part is what do do about the submodule updates. The incoming PR will have changes to the git submodules, and the reviewer will have to undo these, and point them back to the GEOS-Chem forks/repos. We could request that PRs don't include modifications to the .git/submodules—I think this is probably a good idea, but I haven't thought about it too much.

@TerribleNews:

Oops, sorry. I just clicked the link on the GCHP wiki and didn't notice it went to the geos-chem repo.

I am really hoping not to have to go through GMAO. Trying to keep up with the source management changes has been a lot of overhead on this project and I'd really really like to get the code into the GCST's repos so that I don't spend several days each month trying to merge main back in, usually for no code-related reason but because of structural changes in the git repo, like disappearing branches.

I forget how fine-grained the merge process is. Can the reviewer simply ignore them? Or should I create a new commit to the GCHP repo that undoes those changes and submit that as my main PR?

@LiamBindle:

In terms of whether the PRs should be made to GEOS-Chem or GMAO, I think it depends on whether you want to feed the update back to GMAO. If it's GCHP specific, then PRs to GEOS-Chem are good. If it's a more general update—something that affects MAPL in a general sense—then I think it makes sense for the PRs to go to GMAO.

From the reviewers perspective, it's probably easiest if you revert your commits that modified the .git/submodules, but if one were to forget to do this, it's easy for the reviewer to do. I think it isn't a big deal either way, but I think we should encourage people to make this revert before submitting the PR.

I think this would be a good GitHub feature request (multiple PRs spanning multiple submodules). I'll start thinking about drafting a request for this (unless someone else would like to, in which case I'd be happy to collab on the draft)—I suspect other ESM orgs would be interested in this too.

@lizziel:

Sorry all, I am just reading this issue thread for the first time now. Figuring this out is still a work-in-progress but here's what seems to be working best so far for bringing in updates to the super-projects (GCHP and GEOS-Chem Classic both).

PRs should be done on the individual submodules with information in the conversation about what each one goes with, with links to the other submodule PRs.

Only make a PR on the super-project repo if there are file changes, i.e. not just submodule commit changes. The conversation should include a complete list of all submodule PRs that go with the update, with links to them. Do not include submodule hash updates in the super-project PR.

If the only super-project changes are submodules, i.e. no files changed in the GCHP repository itself beyond submodule hashes, then do not make a PR. We can handle coordinating the submodule updates by looking at the individual submodule PRs. You can make an issue, however, in the form of a feature request, to unify information on all the updates. This can serve as an announcement of the upcoming update to all users watching the GCHP repository (including GCST). Include a list of PRs that go with it.

Generally updates to GMAO libraries should go through GMAO but this is a gray area. I think it should be handled on a case by case basis for now. Who interfaces with GMAO on an update may come down to scope of changes and project funding.

@LiamBindle:

That sounds like a good strategy to me @lizziel. Thanks for summarizing this!