microbiomedata / nmdc-schema

National Microbiome Data Collaborative (NMDC) unified data model
https://microbiomedata.github.io/nmdc-schema/
Creative Commons Zero v1.0 Universal

Rename branches to eliminate Berkeley commits from `main` #2077

Closed eecavanna closed 2 weeks ago

eecavanna commented 2 weeks ago

Background

On June 17, one of the repo maintainers accidentally merged the berkeley-schema-fy24/main branch into the nmdc-schema/main branch. This happened via commit b2d35ac3.

In an attempt to recover from that without rewriting Git history, @eecavanna and @turbomam created a feature branch off of main, added a "Revert" commit (191af2f329902ea9e70b6bc2ad9d937c09f31055) that reverted that commit (b2d35ac3), then merged that feature branch into main (6bb70bd9aba3ccbf8cc7bb47cc919fb6e1d40e19).

The file tree in nmdc-schema looked good at that point.

However, the branch's history still contained the 1000+ commits originating in berkeley-schema-fy24/main, even though their changes were now reverted in nmdc-schema/main. As a result, so-called "back merge" operations from nmdc-schema to berkeley-schema-fy24 would no longer be straightforward: from Git's perspective, the most recent operation involving all of the "Berkeley" changes was to revert them (e.g. delete created files, re-create deleted files, etc.).
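To make the back-merge concern concrete, here is a minimal sketch of a file-level three-way merge. It is a toy model, not how Git merges in practice (Git works at the content level and handles renames), and the file names and contents are hypothetical. The key assumption is the one described above: because the Berkeley commits are in the history of nmdc-schema/main, the Berkeley tip itself becomes the merge base.

```python
def three_way_merge(base, ours, theirs):
    """Toy three-way merge over {path: content} snapshots.

    For each path, if only one side changed relative to base, that side
    wins. If both sides changed it differently, the path is reported as a
    conflict and "ours" is kept. A missing key means the file is absent.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o
        elif o == b:
            result = t  # only "theirs" changed (possibly a deletion)
        elif t == b:
            result = o  # only "ours" changed
        else:
            conflicts.append(path)
            result = o
        if result is not None:
            merged[path] = result
    return merged, conflicts


# Hypothetical snapshots; these file names are made up for illustration.
berkeley_tip = {
    "schema.yaml": "berkeley version",
    "new_class.yaml": "added in berkeley",
}
# nmdc-schema/main after the accidental merge plus the revert: the file tree
# looks pre-merge again, but berkeley_tip is now part of its history.
nmdc_main_after_revert = {"schema.yaml": "nmdc version"}

# Back merge into Berkeley: the merge base is the Berkeley tip itself.
merged, conflicts = three_way_merge(
    base=berkeley_tip, ours=berkeley_tip, theirs=nmdc_main_after_revert
)
print(merged)     # the Berkeley-only file has been dropped
print(conflicts)
```

In this model, the revert is the only change relative to the merge base, so it wins everywhere and the Berkeley-only file disappears from the merged result.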

To work around that, @eecavanna and @turbomam renamed the temp-main-backup branch to main. It was one of several branches they had created at various points while doing the above things, and they thought it did not contain the accidental Berkeley-to-base merge commit.

As a result, the repository looked like this (older commits not shown):

image

Problem

However, it seems to me that the temp-main-backup branch (now named main) did actually contain the accidental Berkeley-to-base merge commit (https://github.com/microbiomedata/nmdc-schema/commit/b2d35ac3ee35be5525b6fc2b84f9632e40e2758b).

I suspect @turbomam and @eecavanna (that's me) meant to rename the branch named temp-fixing-accidental-merge to main, rather than the branch named temp-main-backup. Looking at the branches' current contents, that is (conceptually) what I thought I was doing at the time (oops).
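The check that would have caught this is whether a given commit is reachable from a branch tip. Below is a minimal sketch of that reachability test on a toy commit graph. The branch names `candidate-a` and `candidate-b` and every commit id other than b2d35ac3 are hypothetical placeholders, not the repository's real layout. In a real clone, I believe `git merge-base --is-ancestor <commit> <branch>` answers the same question.

```python
def is_ancestor(parents, commit, tip):
    """Return True if `commit` is reachable from `tip` via parent links."""
    stack, seen = [tip], set()
    while stack:
        current = stack.pop()
        if current == commit:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


# Toy history: commit id -> list of parent ids (placeholders except b2d35ac3).
parents = {
    "pre-merge": [],
    "berkeley-tip": [],
    "b2d35ac3": ["pre-merge", "berkeley-tip"],  # the accidental merge
    "revert": ["b2d35ac3"],
}

# Hypothetical branch tips to vet before renaming one of them to main.
branches = {"candidate-a": "pre-merge", "candidate-b": "b2d35ac3"}

for name, tip in branches.items():
    contains_merge = is_ancestor(parents, "b2d35ac3", tip)
    print(f"{name}: contains accidental merge = {contains_merge}")
```

Only a branch for which this check comes back false is a safe candidate for the new main.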

Proposal

The branch named temp-main-backup still exists. There is also a branch named copy-of-nmdc-schema-v10.5.4 that has the exact same contents, but a more self-documenting name.

I propose one of the repo maintainers do the following:

  1. Rename the main branch to main-old-2 (or whatever)
  2. Rename the copy-of-nmdc-schema-v10.5.4 branch to main
  3. Apply branch protection rules to the main branch
  4. Designate the main branch as the default branch
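Steps 1 and 2 above work without rewriting history because a branch is only a name pointing at a commit. Here is a minimal sketch of that idea, modelling branches as a dictionary. The commit ids are hypothetical placeholders, and steps 3 and 4 (branch protection and the default branch) are hosting-side settings that this toy model does not cover.

```python
def rename_branch(refs, old, new):
    """Return a copy of `refs` with branch `old` renamed to `new`.

    Only the name moves; the commit it points at is untouched.
    """
    if old not in refs:
        raise KeyError(f"no such branch: {old}")
    if new in refs:
        raise ValueError(f"branch already exists: {new}")
    updated = dict(refs)
    updated[new] = updated.pop(old)
    return updated


# Branch name -> tip commit (commit ids are placeholders for illustration).
refs = {
    "main": "tip-with-berkeley-history",
    "copy-of-nmdc-schema-v10.5.4": "tip-without-berkeley-history",
}

refs = rename_branch(refs, "main", "main-old-2")                   # step 1
refs = rename_branch(refs, "copy-of-nmdc-schema-v10.5.4", "main")  # step 2

print(refs)
```

The order matters: renaming to main before the old main has been moved aside fails, because the name is still taken.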

Once that's been done, I expect the repo to look like this:

image

Follow-on work

If people have made commits locally based upon the red or purple commits shown in that picture, we will notice that their PR has 1000+ commits in it. @eecavanna (or anyone else comfortable doing so) will then work with them to `git rebase` their work so that their PR branch doesn't re-introduce the accidental Berkeley-to-base merge commit (which would put us back into the situation we're in now).
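As a rough illustration of why such a PR would show 1000+ commits, and why rebasing fixes it, here is a toy model that counts the commits reachable from a PR head but not from its base. All commit names are hypothetical. In a real clone, the corresponding operation would be something like `git rebase --onto <new-base> <old-base> <branch>`, with the actual arguments depending on the contributor's branch.

```python
def reachable(parents, tip):
    """Return the set of commits reachable from `tip`, including `tip`."""
    seen, stack = set(), [tip]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


def commits_in_pr(parents, base, head):
    """Commits a PR from `head` into `base` would show."""
    return reachable(parents, head) - reachable(parents, base)


# Toy history: the new main points at "good"; 1000 placeholder Berkeley commits.
parents = {"good": []}
previous = None
for i in range(1000):
    name = f"berkeley-{i}"
    parents[name] = [previous] if previous else []
    previous = name
parents["accidental-merge"] = ["good", "berkeley-999"]
parents["feature-work"] = ["accidental-merge"]  # local commit on the old main

before = commits_in_pr(parents, base="good", head="feature-work")
print(len(before))  # 1000 Berkeley commits + the merge + the feature commit

# After rebasing, the replayed commit has the new main as its parent.
parents["feature-work-rebased"] = ["good"]
after = commits_in_pr(parents, base="good", head="feature-work-rebased")
print(after)
```

The rebased PR contains only the contributor's own commit, so merging it no longer drags the Berkeley history back into main.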

eecavanna commented 2 weeks ago

In other words, I think the branch renaming solution would have worked; but I accidentally used the wrong branch as the new main branch. :facepalm:

CC: @aclum

shreddd commented 2 weeks ago

OK - for now - avoid deleting branches until this is fixed, in case something needs to be restored. This seems ok as an overall plan.

turbomam commented 2 weeks ago

The nmdc-schema main branch commit history is now free of the delete operations that would have ruined berkeley-schema-fy24.

The berkeley-schema-fy24 GH code page shows that it is ~1170 commits ahead of and 0 commits behind nmdc-schema.

eecavanna commented 2 weeks ago

This is done. Closing.

turbomam commented 2 weeks ago

@eecavanna @shreddd when will you be comfortable deleting the temporary branches?

eecavanna commented 2 weeks ago

I'm already comfortable with them being deleted now, based on the validation we did over Zoom this morning.

I'll be even more comfortable (more comfortable than necessary, from my perspective) at the end of the sprint (i.e. in ~10 calendar days). The hypothetical scenario I think would be simplified for us if we were to wait until then is:

I think if we have copies of those temporary branches around (which we currently do—they are named main-old and main-old-2), it will be easier for us to make sense of that PR.

I like using the end of the sprint as the milestone here because that is a milestone shared by some developers (e.g. they might open a PR for whatever they're currently working on "by then").