sbitxdev / sbitx


Proposal on branching and versioning #4

Open n1ai opened 5 months ago

n1ai commented 5 months ago

I see this effort following a multi-phase development cycle.

Both Phase 1 and Phase 2 will need sub-phases, but this level of detail is enough to formulate the rest of this proposal, so let's not get bogged down in lower-level details for now.

I propose we follow a branching/versioning strategy similar to gnuradio's. I found an older slide here:

[image: gnuradio branching/versioning slide]

You can tell it is old since gnuradio is on version 3.10 now and uses 'main' instead of 'master'.

The slide says 'master will become 3.9', but what it means is that the 'main' branch always keeps moving forward, while release branches are children of 'main'. Doing it the opposite way, with 'main' holding the most recently released code and new work in child branches, means big merges from children back into parents.

I think this is a good approach, one followed by most projects because it makes merging easier. The only problematic aspect I see is that when newcomers come to the code base they will check out 'main', not the current release, and may not realize it is unreleased code. That's just something that has to be communicated.

So, my proposal is:

Feel free to comment. Note I'm not that familiar with the details of what gnuradio does, so I may have some details wrong, but IMO it is a good basis for starting the discussion.

If this proposal is accepted, I next propose we do as above and create a maint-3.1 branch in the near future. After that point, we will have places to commit code for the first two 'phases' I describe above, which IMO is very important to have.
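To make the shape concrete, here is a throwaway-repo sketch of that split. The branch and tag names (maint-3.1, v3.1.0) are illustrative, not settled:

```shell
# Sketch of the proposed gnuradio-style split, in a throwaway repo.
set -e
cd "$(mktemp -d)"
git init -q repo && cd repo
git config user.email dev@example.com
git config user.name dev
echo 'sbitx 3.1' > radio.c
git add radio.c && git commit -qm 'sbitx 3.1 release point'
git branch -M main   # normalize the branch name regardless of git defaults

# Cut the maintenance branch at the 3.1 release point...
git branch maint-3.1
git tag v3.1.0

# ...while 'main' keeps moving toward the next (possibly incompatible) release.
echo 'new work' >> radio.c
git commit -qam 'work toward next release'
```

After this, fixes for 3.1 users land on maint-3.1 and breaking changes land on main, with no large merges from child branches back into their parent.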

Thanks, Dave, N1AI

mikekppp commented 5 months ago

With all due respect, this is fairly complicated. A far simpler approach is that main is always the current release. When work begins on a new version, a branch is created, and all PRs are submitted against that branch (e.g., "3.4"). Each version has a set of GitHub Issues created for that version/release/branch in advance of work starting. Once the GitHub Issues are addressed and all PRs accepted, integration testing starts. Once the acceptance criteria are met (zero P0/P1 bugs), the branch is merged into main, tagged as "Version 3.4", and work proceeds on the next release. Maintenance work is never "backported" to a branch; branches are destroyed once integrated.
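If I read this flow right, it might be sketched as follows in a throwaway repo (version number illustrative; git tag names cannot contain spaces, so "Version 3.4" is rendered v3.4 here):

```shell
# Sketch of the release-branch-per-version flow described above.
set -e
cd "$(mktemp -d)"
git init -q repo && cd repo
git config user.email dev@example.com
git config user.name dev
echo 'current release' > app.c
git add app.c && git commit -qm 'main is always the current release'
git branch -M main   # normalize the branch name regardless of git defaults

# Work on the next version happens on its own branch; PRs target it.
git checkout -q -b 3.4
echo 'feature' >> app.c
git commit -qam 'feature for 3.4'

# Acceptance criteria met: merge into main, tag, destroy the branch.
git checkout -q main
git merge -q --no-ff -m 'Release 3.4' 3.4
git tag v3.4
git branch -q -d 3.4
```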

n1ai commented 5 months ago

One thing I neglected to mention last night: not a lot of Git skill is needed to follow the proposed model.

Gnuradio publishes https://wiki.gnuradio.org/index.php/DevelopingWithGit and that's about the level of knowledge needed.

We'd need to make a web page similar to theirs, and publish a release page so everyone knows which branches are currently active.

n1ai commented 5 months ago

With all due respect, this is fairly complicated. A far simpler approach is that main is always the current release. When work begins on a new version, a branch is created, and all PRs are submitted against that branch (e.g., "3.4"). Each version has a set of GitHub Issues created for that version/release/branch in advance of work starting. Once the GitHub Issues are addressed and all PRs accepted, integration testing starts. Once the acceptance criteria are met (zero P0/P1 bugs), the branch is merged into main, tagged as "Version 3.4", and work proceeds on the next release. Maintenance work is never "backported" to a branch; branches are destroyed once integrated.

Your model is similar to mine, if I understand it correctly. The main difference is that in your model 'main' only moves forward after a release is merged, while in mine 'main' moves forward to contain what will become the next release. Also, in your model branches terminate once merged, but that is problematic in the Linux world. Read on...

The model used by gnuradio includes backports because it is packaged by all the major Linux distros (Debian and Fedora, plus their downstreams), and those distros have the concept of major and minor releases. Major releases can break backward compatibility; minor ones cannot. Major releases are a long time apart, on the order of two years for Debian, and have a long lifespan, typically five years. If your code is in one of the major distros, you are expected to be able to deliver bug fixes across that lifespan.

Thus, branches can and do live for five-year periods, and backports are used to move critical changes from main into the release branches if/when needed. Obviously the older branches get less attention later in their life, but if someone finds a critical bug you need a way to deliver a fix. This is different from the mobile app space, where you can push fixes and upgrades onto people's devices in real time and trigger major OS upgrades on time scales far shorter than five years.
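Mechanically, a backport in this model is usually a cherry-pick of the fix commit from main onto the maintenance branch. A minimal sketch, with all names illustrative:

```shell
# Sketch: backporting one critical fix from 'main' to a long-lived
# maintenance branch, in a throwaway repo.
set -e
cd "$(mktemp -d)"
git init -q repo && cd repo
git config user.email dev@example.com
git config user.name dev
echo 'v3.1' > core.c
git add core.c && git commit -qm 'sbitx 3.1'
git branch -M main   # normalize the branch name regardless of git defaults
git branch maint-3.1

# main moves on with incompatible work...
echo 'new api' >> core.c
git commit -qam 'breaking change for next major release'
# ...then a critical fix lands on main.
echo 'bounds check' > fix.c
git add fix.c && git commit -qm 'fix crash on startup'
FIX=$(git rev-parse HEAD)

# Backport only the fix (-x records its origin) and cut a point release.
git checkout -q maint-3.1
git cherry-pick -x "$FIX" >/dev/null
git tag v3.1.1
```

Note that maint-3.1 receives only the fix: the incompatible 'new api' change stays on main.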

I realize we aren't in any distros, but if things go along the lines some are suggesting (building a portable radio core) then it's pretty clear we could be, and it is far easier to install this kind of release strategy now, while we have a handful of developers, than to retrofit it later with many more. Once installed, it just becomes a way of doing business.

I realize it is more complex than what you propose, but gnuradio has had 300 contributors since moving off svn and onto GitHub, which shows people can and do adapt.

It also decouples our current lack of a clean upgrade/downgrade strategy from two near-term needs: delivering point fixes, and having a place to put PRs that break compatibility. Many of the things I see in PRs would break compatibility, yet IMO are important changes to take in sooner rather than later. This model offers the parallelism to support both needs. It is true that keeping two streams going at once is more complex and more work than keeping one, but I think the benefits outweigh the costs.

mpapple-swift commented 5 months ago

Looking at the document cited (https://wiki.gnuradio.org/index.php/DevelopingWithGit), this is genuinely archaic in terms of modern Git project management. I also strongly question the idea that we would support distributing packages via apt or even wish to be included in Linux distros. That requires a level of support well beyond anything we can plan on providing even in the long term. Also we briefly discussed the fact that we would like to support iOS, Windows, and potentially Android and various browsers. The approach cited in the gnuradio document is not compatible with deploying on multiple platforms.

I'm as they say a pragmatic programmer and focus on working with what a team has and go for solutions that meet basic needs. That's not to say we shouldn't be forward-looking and build infrastructure/foundations that will support our needs going into the future. However, one needs to examine closely what this project intends to provide and support.

Supporting previous versions by backporting 'fixed' code isn't how major OSes or most commercial code works. Every possible piece of the product is broken into the smallest possible unit (library either dynamic or static). If you, say, need to fix a security hole in a network library, you fire up the CI pipeline which pulls in all the libraries/modules from Git with that specific tag, only updating to the new tag from the network library. Perform your integration testing and voila, you push the updated release. There's no messing around with anything in Git other than creating a new manifest for the CI system.

And aside from whatever the Linux vendors do, Microsoft is the only company that supports fairly old OS versions, and that's only because they make a ton of money from it. Same goes for any commercial software vendor. Apple doesn't, because there's no money in it - they only support one previous OS version and only with security fixes which IMHO is appropriate.

You really don't want to manage your SDLC in Git these days, you manage it in your CI/CD system. The system I described is simple and extremely easy to pick up as evidenced by the league of interns we mentored at the companies where I worked. The CI system, however, is something only a few people touch and can be very complex. But that comes back to separation of concerns. Developers interacting with the Git system need only a few simple rules. Developers working on the CI system, CD testing system, and things like GitHub Actions need additional levels of expertise that are unnecessary for casual contributors (or the core team, for that matter).

mpapple-swift commented 5 months ago

To put some specific documentation in hand, I'm proposing a modified GitFlow / Github Flow workflow. Here are some relevant documents describing these various approaches:

Gitflow workflow
Git feature branch workflow
GitHub flow

Videos:
The gitflow workflow - in less than 5 mins
Getting started with branching workflows, Git Flow and GitHub Flow
The Gitflow Release Branch from Start to Finish

Here's a page that describes Git Flow in more detail: The Gitflow Release Branch from Start to Finish

And a Git Flow tool you can use (though I haven't used it myself): git-flow cheatsheet; hosted on Github

As a concrete example, the simplified GitHub Flow works fine if everyone is working on a single feature/enhancement branch. When you have multiple features/enhancements ongoing in parallel, the standard Git Flow approach is best (with a develop branch and all, as described above). Lastly, we should support the hotfix approach outlined here as well, with the strong caveat that hotfixes are for very serious bugs that slip into the wild; P2 and P3 bugs should be rolled into planned releases to increase the level of testing applied to them.
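For the hotfix path in particular, here is a plain-git sketch (without the git-flow helper tool; branch and tag names illustrative) of branching off main and merging back into both main and develop so the fix is not lost:

```shell
# Sketch of the Git Flow hotfix path, in a throwaway repo.
set -e
cd "$(mktemp -d)"
git init -q repo && cd repo
git config user.email dev@example.com
git config user.name dev
echo 'app' > app.c
git add app.c && git commit -qm 'release 1.0'
git branch -M main   # normalize the branch name regardless of git defaults
git tag v1.0
git branch develop   # ongoing feature work integrates here

# A serious bug escapes into the wild: branch the hotfix off main...
git checkout -q -b hotfix/1.0.1 main
echo 'urgent fix' >> app.c
git commit -qam 'fix P0 bug'

# ...then merge it into BOTH main and develop, tag, and delete the branch.
git checkout -q main
git merge -q --no-ff -m 'Hotfix 1.0.1' hotfix/1.0.1
git tag v1.0.1
git checkout -q develop
git merge -q --no-ff -m 'Merge hotfix 1.0.1' hotfix/1.0.1
git branch -q -d hotfix/1.0.1
```

The double merge is the one non-obvious step: without the merge into develop, the next planned release would silently reintroduce the bug.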

I have started a policies repo with an initial "how we do Git" document. I'll file a PR against it for discussion as the intended vehicle for discussion on our workflow in Git.

And I'll reiterate that while Git informs how we develop code and manage it for archival / storage purposes, the CI system controls how we create releases.

n1ai commented 5 months ago

I thank you for your comments and your proposal. I will definitely evaluate it, but I will also push back a bit on some of your comments, just to help us and the team understand each other better. Please read on.

Looking at the document cited (https://wiki.gnuradio.org/index.php/DevelopingWithGit), this is genuinely archaic in terms of modern Git project management.

It was posted in response to the critique that things need to be simple.

It was posted to illustrate simplicity.

There certainly are more modern ways to achieve the same things.

I see below you describe your proposal as having high complexity, with that complexity centralized and limited to a few critical-path individuals. In general it is good to centralize complexity, but this seems to lean heavily into the 'gods and mere mortals' approach, which has its flaws as well.

The approach I'm describing is perhaps not as simple for end users but IMO has the benefit of being quite transparent.

As mentioned, over 300 people have contributed to gnuradio since it moved to git. If they can understand the software itself and basic git concepts, the branch strategy is not much of an extra hurdle. In practice, checking out a branch is one command, and even easier if you are using an IDE that shows you the active branches.

I also strongly question the idea that we would support distributing packages via apt or even wish to be included in Linux distros.

The only platform we now run on is Debian Linux. I think we need to use something like apt that knows the dependencies of the platform we're running on.

We certainly don't need to be in a Linux distro right now, but if some suggestions of a broad-based radio core come true, IMO we would want to be in one then.

Also we briefly discussed the fact that we would like to support iOS, Windows, and potentially Android and various browsers. The approach cited in the gnuradio document is not compatible with deploying on multiple platforms.

Just because you support a native packaging method for one platform doesn't mean you can't support the native packaging methods for other platforms too. I presume you have to if you want your software to get into all the various ecosystems you list. Debian Linux is just one of those, and like it or not, our first one.

I think the current sbitx 'git pull' and local-build update strategy is deeply flawed and leads to great dissatisfaction that is being addressed on groups.io. I think we need a packaged binary release in the near future, and to achieve that, .deb format is IMO the obvious choice. Packaged binaries will remove ambiguity about what version a given user has installed, how it was built, etc.
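For what it's worth, a .deb does not require much machinery: the core of it is a DEBIAN/control file. The sketch below is illustrative only (package name, version, architecture, and dependency list are assumptions, not an agreed spec):

```
Package: sbitx
Version: 3.1.0-1
Architecture: arm64
Maintainer: sbitx developers <devs@example.com>
Depends: libasound2, libgtk-3-0
Description: sbitx SDR transceiver application
 Packaged binary release of the sbitx application.
```

Given a staging tree with that file at pkgroot/DEBIAN/control and the binaries under pkgroot/usr/, `dpkg-deb --build pkgroot` produces an installable package; apt can then resolve the declared dependencies and report the exact installed version.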

I also think we need to build complete Pi images in the medium term so we can move things to more appropriate locations in the file system without the user base having to do it one user at a time, especially since I also feel it's inevitable we break backward compatibility with older sbitx software.

I'm as they say a pragmatic programmer and focus on working with what a team has and go for solutions that meet basic needs.

I agree. I will go into the approach you propose in detail.

I think it's good we have more than one proposal. I have asked for 'best practices' before and you have provided them. One thing I hope to gain from this project is learning new things.

The downside is that it will take me some time to go through your proposal. I'll try to make it a priority, but I have a lot of priorities right now.

That's not to say we shouldn't be forward-looking and build infrastructure/foundations that will support our needs going into the future. However, one needs to examine closely what this project intends to provide and support.

I agree.

Supporting previous versions by backporting 'fixed' code isn't how major OSes or most commercial code works.

I disagree. Major OS platforms do backport fixes. Linux does. Microsoft does. Cisco IOS certainly does, and it's IMO the best-of-breed commercial product in its space.

Every possible piece of the product is broken into the smallest possible unit (library either dynamic or static). If you, say, need to fix a security hole in a network library, you fire up the CI pipeline which pulls in all the libraries/modules from Git with that specific tag, only updating to the new tag from the network library. Perform your integration testing and voila, you push the updated release. There's no messing around with anything in Git other than creating a new manifest for the CI system.

It seems to me that this approach works because you can choose when to "push" code to end-user devices. I don't see us doing that in the near term. We currently have lots of devices running different OS and sbitx releases. Some (many?) users will be upset if the device upgrades itself arbitrarily, and we'll be responsible for any ensuing breakage. I think it's better in the short term for users to be aware of upgrades, and for us to publish a downgrade procedure in case things break. I think it's better to consolidate the user base on a smaller set of OSes and versions first, then build and test the runtime infrastructure needed to move to a "push" model later.

Feel free to correct me, since it's not clear to me what the expected longevity of the pushed item is. I don't see how to "get there from here". We'll need a process/roadmap to make that happen. Users will have to have far more confidence in the software than they have now for them to be OK with a push model.

And aside from whatever the Linux vendors do, Microsoft is the only company that supports fairly old OS versions, and that's only because they make a ton of money from it. Same goes for any commercial software vendor. Apple doesn't, because there's no money in it - they only support one previous OS version and only with security fixes which IMHO is appropriate.

Every major Linux distro does. Microsoft does. Cisco IOS does.

Today our code base runs on Linux and has dependencies on other Linux packages that need to be factored in when things get released. In theory we could move to something like Snaps, which package all dependencies, but they are widely criticized for creating isolated blobs of outdated software on a given system.

The system I described is simple and extremely easy to pick up as evidenced by the league of interns we mentored at the companies where I worked. The CI system, however, is something only a few people touch and can be very complex. But that comes back to separation of concerns. Developers interacting with the Git system need only a few simple rules. Developers working on the CI system, CD testing system, and things like GitHub Actions need additional levels of expertise that are unnecessary for casual contributors (or the core team, for that matter).

I look forward to learning more and intend to put in the effort to do so, but cannot commit to a time frame other than saying I will start soon.

Till then, I personally still feel that having two different code streams is very important, so we can issue bug fixes to keep the current sbitx user base satisfied while working in parallel on better stuff that definitely will not be compatible with the current code base, and I want to know how your proposed model can satisfy that need.

I could see two different repos, but then the code itself and its change history would rapidly diverge. That's why GitHub has the one-and-only-one fork rule: it keeps the change history intact.