theochem / .github

Guidelines for various activities and initiatives with QC-Devs
Creative Commons Attribution Share Alike 4.0 International

Document default repository and continuous integration setup #5

Open tovrstra opened 3 months ago

tovrstra commented 3 months ago

As mentioned in #2, a practical setup guide would bring consistency and help transfer knowledge about all the technicalities of setting up a repo.

The following issue has most of the ingredients, but needs to be transformed into a readable form with more detail: https://github.com/theochem/iodata/issues/313

TODO:

PaulWAyers commented 3 months ago

Should we add a template repository? That, plus a setup guide, could be a good way to handle the issue IMHO.

tovrstra commented 3 months ago

There are advantages and disadvantages to using a template repo. I'm currently inclined not to do it, which may seem surprising. I'll try to explain. Templates seem quite nice but are actually tricky, I think.

The main advantage is that you can quickly set up a new repo with all the bells and whistles. The cookiecutter or GitHub template repositories can make this setup very smooth. (I have some experience with the cookiecutter.)

The main challenge is keeping the template repository up to date and working, which may outweigh the benefit of setting up a new repo quickly. For example, the official cookiecutter-pypackage template is outdated: it uses Travis for CI, no GitHub Actions, and no pre-commit. This template has 122 contributors, so I'd expect it to be cutting-edge. Some of this may be a matter of taste, but the number of open issues and pull requests also seems to indicate that maintaining a template repo is not easy.

I think the reason is that CI is needed to keep the template itself in good shape, and there are no established tools for doing CI for a template. (There is a workflow with GitHub Actions for testing the cookiecutter-pypackage template itself, but it seems incomplete and it fails.) I believe the template authors put a lot of effort into this, but it's just not easy to get right.

So I'm not inclined to go down the template rabbit hole, but we should have a practical solution for getting started with a new repo and also updating an existing one (which templates cannot do). Documentation (including pointers to good external documentation) and one or a few working examples seem more feasible. A good and real-life example is easier to maintain than a template. Building a good CI infrastructure for a new repo is also a gradual process. As you add more and more CI components, you have to learn how to use them effectively. My current impression is that this learning aspect is more important than having the files in place.

Anyway, this is just my current perspective. The situation can always change.

PaulWAyers commented 3 months ago

Thanks for the detailed description. I agree, there are some trade-offs. Since I think creating a template repo is a low priority right now anyway, we can say "for now, no" and revisit later.

tovrstra commented 3 months ago

I agree. We can take a gradual approach and still decide to create a template later. Documentation is needed in either case, so starting with documentation seems like a safe move.

FanwangM commented 3 months ago

First-time contributors sometimes start working directly on the main branch of their fork, which can mess up the git history. We should try to prevent pull requests from being merged from the main branch of a forked repo. It would be beneficial if more instructions were given in https://github.com/theochem/.github/blob/main/contributing/development_setup.md.

If you think this is needed, I can try to cook up something. @PaulWAyers @tovrstra
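As a starting point, here is a minimal sketch of how such a check might look as a CI step. It assumes the check runs in GitHub Actions on a `pull_request` event, where the `GITHUB_HEAD_REF` environment variable holds the source branch name. The function name `check_source_branch` and the set of discouraged branch names are hypothetical and only for illustration; this is not an existing QC-Devs tool.

```python
from typing import Mapping

# Assumption: branch names that should not be used as a pull-request source.
DISCOURAGED_SOURCE_BRANCHES = {"main", "master"}


def check_source_branch(env: Mapping[str, str]) -> int:
    """Return an exit code: 0 if the PR source branch is fine, 1 otherwise.

    ``env`` is expected to look like ``os.environ``. GitHub Actions sets
    GITHUB_HEAD_REF only for pull_request (and pull_request_target) events.
    """
    head_ref = env.get("GITHUB_HEAD_REF", "").strip()
    if not head_ref:
        # Not a pull-request run (e.g. a push), so there is nothing to check.
        print("No pull-request source branch found; skipping check.")
        return 0
    if head_ref in DISCOURAGED_SOURCE_BRANCHES:
        print(
            f"Pull request comes from '{head_ref}'. "
            "Please create a feature branch in your fork instead."
        )
        return 1
    print(f"Source branch '{head_ref}' is fine.")
    return 0
```

In a workflow step, this could be invoked with `sys.exit(check_source_branch(os.environ))`, so that a pull request from the fork's main branch fails the check. Note that this only flags the problem after the pull request is opened; the documentation would still need to explain why a separate branch is preferred.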

tovrstra commented 3 months ago

@FanwangM Thank you for that suggestion. It would be good to briefly mention this in development_setup.md as a heads-up for step 2 in the instructions in contributing/workflow.md. In that step 2, we can add a few bullets with reasons for using a new branch instead of making pull requests from the forked main. The problems I've seen with it are: