galaxyproject / galaxy-helm

Minimal setup required to run Galaxy under Kubernetes
MIT License

RFC: Branch organization and repo rename #42

Closed afgane closed 3 years ago

afgane commented 5 years ago

I'd like to propose a branch (re)organization on this repo to the following:

In addition, I'd like to suggest we rename this repo to galaxy-helm to make it more indicative of the actual repo content.

afgane commented 5 years ago

@pcm32 Any chance you could take a look at the branches on this repo and see if any can be cleaned up?

pcm32 commented 5 years ago

Sure, will take a look... I think there is one feature branch that is active and the other ones are merged... will check asap.

nuwang commented 4 years ago

ping @pcm32 Let us know which branches you want to keep and we'll delete all the rest.

pcm32 commented 4 years ago

I would prefer that we go back to a git-flow scheme again, where we keep tags on the master branch and we develop on develop, which is also the standard practice on many repos (as in the main Galaxy repo, at least the develop part). I would like us also to make more minor releases and actually stick to them (so not over-writing a tar.gz in an archive repo). I need us to do this so that I can leave some setup pointing to fixed versions. Normal users can see the latest release on the master branch and more daring users can go bleeding edge on develop.

We can have the CI index and tar.gz merges that we do to master (only releases) with their tag as version. I will also add my CI from another repo that actually spins up the whole chart for testing on minikube. If you want to have version branches, they could stem from master whenever needed (but I'm more of the idea of releasing often with minor versions, in the end the packaged charts are light-weight).

afgane commented 4 years ago

The versioning of the chart is absolutely something we should adopt. Thus far, there was no real reason to because there had been no initial release and only the dev version existed, constantly churning with WIP and bug fixes. There was also the time cost of setting up the versioning. Given the GVL5 release from Friday, we've reached a milestone and there is new work starting around CI, which will also require that versioning be handled better.

Re. the branches, I think we should stick with the master branch being the current tip and having release branches for major releases. It is the same setup we've now established for all repos relevant to the GVL and CloudVE, so having this one be different is undesirable. It is also the default when creating new repos on GitHub and hence a widely adopted practice.

nuwang commented 4 years ago

Yes, we need to tag and actually release stable versions regularly. The lack of proper CI testing makes this difficult, so once CI tests are in, we should be able to do this frequently. For now, I've tagged a new release to coincide with GVL 5.0.0.

Agree with sticking with a single master branch and keeping things simple. We can tag stable versions periodically and create named branches off those tags if required, in particular for major releases.

pcm32 commented 4 years ago

Well, what can I say, I don't like having the same branch for stable, develop and the releases... but, I don't have the time to keep arguing about this. I would hope at least that we protect master from direct commits and that we use squash and merge for a cleaner history (this is how I was doing it I think before you guys took over).

pcm32 commented 4 years ago

Guys, we haven't had releases for around a month now, and some PRs are waiting. We also need to become compatible with the default k8s that will start to roll out on GCP (see #130). I can take a more active role (doing releases and setting up CD for that), but I would like to move back to a scheme where master is only for releases, to simplify the CD setup. Since releases have been done manually and left a bit orphaned here for a while, I would rather take control of that again.

nuwang commented 4 years ago

As I mentioned earlier, I don't think that follows the norms of most other projects. Usually, the trunk merely contains the aggregate of ongoing development PRs - with tests passing if all goes well. If you want a stable branch, the norm would be to create a fork, or a branch that's "stable" by whatever definition of stable you have.

What I would propose for your CD setup is a custom branch that is hooked up to the CD system. That way, you can pull from trunk and update your stable branch at whatever cadence that suits your setup. I don't think we can really maintain a "stable" branch that suits everyone, because that would mean a project that does only bug fixes and doesn't evolve. What if there's a breaking change? When a breaking change occurs, it'll be up to you to make the necessary changes to your setup, so by definition, we can't do that. The logical thing would be for periodic "releases", which others can use. If you want an ongoing "stable" that never breaks for you, that will need to be maintained by you in a way that suits your setup, periodically pulling from master at a cadence that suits.

almahmoud commented 4 years ago

In terms of CD, I've been working on using GitHub workflows for automatic version management and packaging.

First round works as of yesterday: when a PR is merged, the version is bumped, then the chart is packaged and pushed to helm-charts. The version bump is controlled by labels on the PR: [patch_bump, minor_bump, major_bump] for now, where the version is major.minor.patch. E.g., https://github.com/almahmoud/galaxy-helm/pull/47 after it got merged triggered https://github.com/almahmoud/galaxy-helm/runs/547651679?check_suite_focus=true which pushed https://github.com/almahmoud/galaxy-helm/commit/d9f3c9715e6e59c4f73d01427d2243589006b1ac and https://github.com/almahmoud/helm-charts/commit/1549461cf8932b5fa2895a1de870f312a7657b72. I think when we add this, we will have a much easier time merging PRs that will always result in a new or minor version change, so old versions remain stable, and we have continuous releases. I do agree with nuwang that master shouldn't be "stable", so if one wants to point to a non-changing version, they should pin it down or use the release branch, not master. I think master should be an edge version of the chart, so it'd be maintained with good versioning, but is very much allowed to take breaking changes, as it's meant to be the newest best version of the chart, not one that is always backwards compatible.
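To make the label-driven bump concrete, here is a minimal Python sketch of the rule described above. It is not the workflow's actual code: the function name `bump_version`, the precedence when several labels are present (major over minor over patch), and the choice to leave the version unchanged when no bump label is set are all assumptions made for illustration. Only the three label names and the major.minor.patch format come from the comment.

```python
import re
from typing import Sequence

# Label names come from the comment above; checking them in this order
# (most significant first) is an assumption for this sketch.
BUMP_LABELS = ("major_bump", "minor_bump", "patch_bump")


def bump_version(version: str, labels: Sequence[str]) -> str:
    """Return the next major.minor.patch version for a merged PR's labels."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"not a major.minor.patch version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())

    # Pick the most significant bump label present on the PR, if any.
    bump = next((label for label in BUMP_LABELS if label in labels), None)

    if bump == "major_bump":
        return f"{major + 1}.0.0"
    if bump == "minor_bump":
        return f"{major}.{minor + 1}.0"
    if bump == "patch_bump":
        return f"{major}.{minor}.{patch + 1}"
    # Assumption: a PR without a bump label leaves the version untouched.
    return version
```

For example, `bump_version("3.4.2", ["minor_bump"])` resets the patch component and yields `"3.5.0"`. Resetting the lower components on a minor or major bump follows the usual semantic-versioning convention.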

pcm32 commented 4 years ago

> I don't think that follows the norms of most other projects

Which projects do you mean here, and how is that relevant? For what it's worth, my proposal is to essentially use git flow, which is probably the most common git branching pattern among existing branching patterns. We do need multiple production versions, and git flow gives you that.

> the norm would be to create a fork, or a branch that's "stable" by whatever definition of stable you have

This stable and trunk speak sounds to me a lot like CVS and Subversion, whose paradigms are quite old and left behind.

> I don't think we can really maintain a "stable" branch that suits everyone, because that would mean a project that does only bug fixes and doesn't evolve

I disagree, that is why you use a master branch for releases, where you are safer, and a develop branch for bleeding edge changes that are merged by features. That is how thousands of git flow repos (and similar branching patterns) operate, so I doubt that we cannot do the same.

> If you want an ongoing "stable" that never breaks for you, that will need to be maintained by you in a way that suits your setup, periodically pulling from master at a cadence that suits.

Which is exactly what periodic releases do by merging develop to master, at a desired cadence as you put it.

Anyway, if Alex has set up CD and packages are pinned somewhere, then it would be fine for me at this point. I don't like the idea of having a release per PR, but I think it is better than the current situation where there haven't been any releases for a while.

ksuderman commented 3 years ago

The only task remaining to close this issue is to remove stale/dead branches. This sounds like a good paper cut, but if anyone has a dead branch out there, now would be a good time to delete it!

RE: CI/CD - while I am not working on that directly, I have been developing some scripts and playbooks to bootstrap clusters from zero to working Galaxy as I familiarize myself with Kubernetes on the various cloud platforms. They could be a start for a CI/CD system and I will push them somewhere to GitHub.

Also, I am a little late to the discussion, but I am in the pro git-flow camp, or at least, the pro consistent naming convention camp. Whatever process is chosen, ideally all repositories share it. Since we are unlikely to convince galaxyproject/galaxy to change, we should likely go with something similar to what they are using [1]:

  1. https://github.com/galaxyproject/galaxy/blob/dev/CONTRIBUTING.md

almahmoud commented 3 years ago

I'm going to close this issue as I think it's old, we've had many conversations about this, and we don't have the bandwidth to waste on it. Also, are there any practical real-life problems that this solves? I think we already addressed both original issues of having releases be packaged more often and not overwriting pinned versions, and I think we've come to a decent solution after last GCC that accomplishes those goals while also reducing/eliminating the burden of managing repositories.

Just for the record before I close the issue, I'll try to summarize what I can remember of the conversations we've had before making the decision:

1) We will not have release branches, nor track this repository in relation to the galaxy repo. In theory someone can launch the new Galaxy image on the old chart, or an old Galaxy image on the new chart. Linking chart version to Galaxy version is a lot of unnecessary overhead, and this chart is supposed to be a deployment mechanism for any Galaxy, not tied to a specific Galaxy image. It is also common practice for charts not to be versioned the same as the app they deploy, as the app will likely change much more often than the chart once the latter becomes stable. With that in mind, release branches are a useless overhead that doesn't really help us much. Also, if we were to talk about a conceptually similar repository, I'd say we are closer to https://github.com/galaxyproject/ansible-galaxy than to the Galaxy core repo, because that compares deployment with deployment, not deployment with app.

2) Feature/bug branch naming conventions seem purely aesthetic. We want to encourage people to use forks, and for the few of us who do have write access, the branches are usually just an ephemeral convenience thing. If it's really that big of a deal, I'd rather just delete all branches and revoke write access to the repo from all of us than start having to make a conscious decision for naming branches. We don't want to encourage anyone to look at branches or use them anyway, and I couldn't care less what people are naming their branches on their forks, so why does this even matter? I understand the aesthetics bring value to big projects where organization matters because there are hundreds of people collaborating, but for this repository we have 2-4 people, so if we spend more time on management overhead than actual development we're left with nobody to develop the actual chart (which is only a part of the stack that the very few of us are maintaining).

3) I originally had proposed we maintain a stable branch that we consciously sync master to, and treat master as an edge repository, but have been convinced that the ideal scenario is that the master branch itself should be the latest stable: why would we ever merge something into master that should not go to stable? We've discussed whether it is better to have versions be incrementally bumped with every change, or to have actual "release" versions tagged that we consciously package when we decide. We used to have a model similar to the latter, but moved to the former to avoid the problem where we have to dedicate time to deal with "official" releases, and decided to go down a more Continuous Delivery model where we continuously package and make accessible every new version as each change is merged into master. This is not unlike many other charts. The main criterion that was asked for by Pablo, if I remember correctly, was that packaged versions are never updated and backwards-incompatible changes are clearly marked, so the bumping model seemed the best way to minimize the conscious effort needed to manage the repository and versioning, while also allowing people to remain on their pinned versions with a guarantee that the version will not be changed from under them.
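The guarantee that "packaged versions are never updated" can be illustrated with a small Python sketch. This is not how the helm-charts repository or its workflow is implemented; the `publish` function, the `VersionAlreadyPublished` exception, and the idea of an index mapping a version string to a SHA-256 digest are hypothetical names and structures chosen only to show the rule: a version that has already been published may not be replaced with different content.

```python
import hashlib
from typing import Dict


class VersionAlreadyPublished(Exception):
    """Raised when a publish would overwrite an existing packaged version."""


def publish(index: Dict[str, str], version: str, package: bytes) -> Dict[str, str]:
    """Return a new index with `version` added; never overwrite a version.

    `index` maps a chart version to the SHA-256 digest of its package
    (a hypothetical structure for this sketch).
    """
    digest = hashlib.sha256(package).hexdigest()
    existing = index.get(version)
    if existing is not None:
        if existing == digest:
            # Re-publishing identical bytes changes nothing, so allow it.
            return dict(index)
        raise VersionAlreadyPublished(
            f"version {version} is already published with different content"
        )
    updated = dict(index)
    updated[version] = digest
    return updated
```

Under this rule a changed chart can only reach users under a new version number, which is what lets a pinned version stay fixed.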

Overall, I don't think this is a good use of anyone's time, but if there is actually a problem that needs to be fixed regarding the versioning model, I'd say let's discuss it in a new issue as this one is outdated and we don't want to conflate things.