Consider developing more in branches

jessegreenberg commented 2 months ago

We would like to consider working on feature branches instead of doing all development on the main branch. We are planning on a brainstorm discussion for this. This issue is to document thoughts about what this process change would entail.

To get started, here are some benefits of working in feature branches:

More efficient workflows - Deferring git hooks and testing until the merge means less time spent every commit.
A more stable main branch for PhET and POSE developers.
Higher code quality - Only reviewed and tested changes make it to the main branch. Half-finished changes do not get pushed to the main branch or make it to production.
Freedom to experiment - less stress about breaking CT and developer workflow for intensive changes.

Here are some potential drawbacks of working in feature branches:

More time resolving merge conflicts if merges are infrequent.
It is not clear how this will work for PhET's polyrepo setup? You would need a branch in each every changed repo for each feature.

Here are some questions that need to be answered.

What are other benefits/drawbacks can people think of?
What tooling can we try out to support PhET's polyrepo setup?
How can we ensure that feature branches are kept up to date with the main branch? What is the process and who is responsible for that?
What does the PhET team need before we "commit" to a branching strategy? A documented process?
Would CT need to test feature branches? Do we need to support QA/design team testing changes on feature branches?
What needs to be decided before someone can start experimenting with developing in branches?
Should we consider a new versioning strategy for our common code repos? That way sim repos can stay on older versions of dependencies while common code repos can continue to progress. (This sounds complicated, I am just brainstorming).
What are best practices for developing in feature branches? Review resources.

zepumph commented 2 months ago

@samreid and I are very excited for our discussion tomorrow! Here are some notes from a discussion we had today about branches. One central question we had was "What would it look like if we wanted to move Buoyancy development to a branch today."

Some more discussion questions:

Feature branches for each individual, or one per sim, and then a single checkout for most common code.
"main is stable" now, vs. "main is just as unstable, but it always passes git hooks"
What if we have a new branch for all repos called "unstable". Then we do everything we current do off of that. Maybe sim branches still branch off of main though?
Noting that there will always will be conflicts and disruptions between different projects, but the goal is that we get to decide when to handle them (not if).
Use of git cherry -v was very helpful for releasing Projectile Data Lab in RC. git cherry -v 1.0 main. MK uses Webstorm (git->branches->repo->branch->"Compare with OTHER_BRANCH")
Policy adjustment about the overhead behind branches (github issues to match). Instead use a different convention, or improve tooling to handle this. Ideas: single issue for all branches, name the branch for a github issue, special branch name schema ("feature/upgrade-typescript-5.4/chipper-1432"), BRANCH_README.md committed in the branch, etc.

Notes on how to do testing:

CT should always test main.
CT should always test unstable.
CT should be able to test arbitrary feature/sim branches.
What are the cCode quality standards for feature/sim branches. Is it just the wild west?
Can a sim branch run CT locally to ensure confidence?

brent-phet commented 2 months ago

FYI - I am interested in the github issue <-> branch relation. I've used it in the past to understand the status of an issue by following the link to the branch and comparing it to main (in our case). This is also helpful for compiling release notes depending on the relation between issues/branches and releases.

Policy adjustment about the overhead behind branches (github issues to match). Instead use a different convention, or improve tooling to handle this. Ideas: single issue for all branches, name the branch for a github issue, special branch name schema ("feature/upgrade-typescript-5.4/chipper-1432"), BRANCH_README.md committed in the branch, etc.

brent-phet commented 2 months ago

Also a question: since we are discussing feature branches here, does that mean discussion on other branches (hotfix/bug/qa/release) would be follow-on conversations?

marlitas commented 2 months ago

I did some googling as @pixelzoom recommended in slack and found some really useful resources.

Microsoft DevOps
Trunk Based Development
Gitflow
I read a discussion forum that mentioned something about IntelliJ's branch management. Haven't been able to look too deeply into it, but here's the documentation for reference.

marlitas commented 1 month ago

5/7 Meeting Summary:

The meeting focused on discussing branching strategies for development within the PhET project. The team brainstormed ideas, identified concerns, and planned for subteams to start using branches in some form. Here are the key points discussed:

Github flow vs. Trunk:

Github flow involves feature branches, pull requests, release, and merge into main, which might be more useful for products using Continuous Integration.
Trunk involves feature branches, pull requests, merge into main, release, with the goal of reducing development commit headaches.

Benefits of Branches:

Branches were seen as a solution to reducing the risk of committing code that breaks others' work or spending excessive time on precommit hooks.
Sim development could benefit from branches, simplifying code management.
Branches could improve collaboration, code review, and testing processes, leading to more stable main branches.

Concerns and Questions:

Questions arose about how multiple developers working on branches simultaneously would affect merges and collaboration.
Concerns were raised about the potential overhead of managing numerous branches across repositories.

Proposed Solutions and Ideas:

Suggestions included using automated tooling for branch announcements and implementing an unstable branch for testing changes before merging into main.
Discussions touched on versioning common code and potential complexities in maintaining unified changes across repositories.

Action Items:

Subteams were recommended to start branching soon, with discussions on branch naming conventions planned for a future meeting.
Infrastructure considerations for testing off main branches were noted, with a suggestion to open an issue for branch management.

Overall, the team emphasized the importance of keeping branches short-lived and discussed transitioning from commit-based concerns to merge-based concerns in development processes. Further meetings were planned to address specific concerns and finalize branching strategies.

samreid commented 1 month ago

Some notes to jog my memory for Thursday's conversation:

Prefer short-lived branches

On Wednesday I asked @marlitas about long vs short-lived branches. @marlitas referred to this section https://trunkbaseddevelopment.com/youre-doing-it-wrong/#duration-of-short-lived-feature-branches which says:

The short-lived feature branch should only last a day or two and never diverge from the trunk enough so that a merge back is problematic. After the seal of approval from code reviewers and CI daemons, it should be merged back into the trunk. It should be deleted, as proof of convergence. The developer in question may then go ahead and make the next short-lived feature branch for the next story/task they’re doing.

Thanks for helping me understand that better, that makes perfect sense.

This also overlaps with a proposal @zepumph described (if I understood correctly), where there would be a long-running "development" branch. If I understand correctly, that proposal shares commonalities with the Gitflow workflow described here, where most work is done in the "develop" branch and merged to main at checkpoints: https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow. However, that section is guarded with the caveat:

Gitflow is a legacy Git workflow that was originally a disruptive and novel strategy for managing Git branches. Gitflow has fallen in popularity in favor of trunk-based workflows, which are now considered best practices for modern continuous software development and DevOps practices. Gitflow also can be challenging to use with CI/CD. This post details Gitflow for historical purposes.

Therefore, it seems that the feature branches proposal described in https://www.atlassian.com/git/tutorials/comparing-workflows/feature-branch-workflow is the most promising.

Main is stable

The main theme of our discussion has been to increase the stability of the main branch, by pushing higher-quality less-frequent changes to main. I believe this is related to the problem we worked on in https://github.com/phetsims/tasks/issues/1106. CT being green is not a sufficient condition to main being stable, but it is a necessary condition. I don't know of any of our tests that are overly sensitive (like an assertion or failure that is "ok"). Our team is kind of using main as a way to invoke testing on CT. Like "things are working well in my working copy, let's see what CT says about it". But in order to do that, we have to push to main, thus destabilizing main. Maybe one way to describe the instability of main right now is: if common code has received one push that hasn't yielded a solid green column, main is unstable. So we will need a way to better test things before they get to main. Maybe the gitflow "develop" branch was supposed to be one way of doing that? But I think we should look for other ways.

samreid commented 1 month ago

I used a short-lived feature branch for the development of https://github.com/phetsims/density-buoyancy-common/issues/121 and it went very well. A brief summary of my process:

create the branch. Notify my subteam. I used the branch name focus-order/density-buoyancy-common/121
commit + push to the remote branch as I go. Type check, lint, test as needed. Pushed 12 commits over the course of a few hours. No worries about failing CT, CTQ, or disrupting another simulation, or disrupting others working in the same simulation.
run the precommit hooks once
merge to main + push
delete the branch (locally and remote)

I could have also left the changes in the branch for the code review. I was also surprised there was no evidence of the deleted branch afterwards.

This was a straightforward process and worked well for this "hello world" ultra-simplistic use case.

samreid commented 1 month ago

Brainstorming one more "stepping stone" idea before we work out our long-term solution. We could develop a notion of "unstable" on CT. For instance, if code has been pushed to main that has not yet yielded a solid green column, then we would consider main "unstable". In that case, team members would be warned to "pull at your own risk". Once new code has been pushed to CT and has yielded a green column or two, then we could consider it pullable. Of course it would be nice to supplement this with other testing (more thorough than precommit hooks) so code can already be stable by the time it lands on main. Just an idea.

zepumph commented 1 month ago

Last week we decided to try a branch naming schema like: feature-name/issue-repo/issue-number.

Today @samreid @AgustinVallejo and I found that our currently decided on branch naming (with path slashes) looks awkward in the git branches dialog of Webstorm. It creates directory structure that results in more key presses. And it results in feeling like you are checking out a branch named as just the issue number.

Instead we recommend changing this to feature-name_issue-repo_issue-number

pixelzoom commented 1 month ago

Instead we recommend changing this to feature-name_issue-repo_issue-number

That general form hurts my brain.

It's almost the same thing, but how about featureName_issueRepo_issueNumber. Require camelcase for featureName, and ban separators (especially '_') in featureName. Bonus point for verifying that issueNumber is actually a positive integer.

For example:

stackTraceLimit/density-buoyancy-common/121

samreid commented 1 month ago

The recommendation above has inconsistencies and it is unclear what is meant. "how about" has "_" but the example has "/". The how about has issueRepo as camel case but in the example, it is hyphenated.

pixelzoom commented 1 month ago

Sorry, I mean stackTraceLimit_density-buoyancy-common_121 for the example. Guess I've been looking at too many "conventions for feature branch names", all of which recommend using "/" by the way.

issueRepo is a placeholder, not implying camelcase.

phetsims / phet-info

Consider developing more in branches #222

Prefer short-lived branches

Main is stable