Monorepo migration checklist

philbooth commented 5 years ago

@dannycoates has been playing around to see what it would look like if we move to a monorepo for all of the FxA code, and it looks great! But there's a bunch of obstacles to get past before it's ready for real life, so this issue is for discussion of those.

[x] Sort out CI, e.g. pushing content server changes should only run content-server CI tasks. But in some cases we would want some cross-coverage though, e.g. pushing auth server changes should exercise tests in the js client and (a subset of) the content server. Also, Travis and Circle, do we keep both or can we use just Circle, or...?
[x] Train cutting scripts. We can't have all the changelogs littered with commit messages from peer directories, so we need to make sure they're bound to just their own directory's changes.
[x] L10n. I have no idea how l10n works, but obviously we can't break it. Maybe that will be fine, but someone who knows that stuff needs to weigh in.
[x] Private repos. The customs server is always deployed from a private repo, and other services occasionally are. We'll need a private monorepo so that we can pursue the same strategies going forwards.
[x] Project management
[x] Does anyone mind us doing this? We should check with the rest of the FxA team, Application Services, Ops, SecOps, etc and so on.
[x] Rename and archive this repo, so that monorepo remote can be mozilla/fxa.
[ ] Make sure all pre-commit hooks are working okay, e.g. auth server email template versioning hook.
[x] Ensure fxa-dev works as expected.
[x] Ensure latest.dev.lcip.org updates so that we do not lose teamcity. (I think this should happen if the preceding happens?)
[x] Old tags. Ensure all of the tags from old repos are available in prefixed form in the monorepo. This is a pre-requisite for sanely tagging releases from the monorepo. I've seen issues where the train 133 tags are missing in the monorepo, for instance.

/cc @shane-tomlinson @vbudhram @lmorchard @ianb @clouserw @jrgm @jbuck

clouserw commented 5 years ago

/cc @jvehent @rbillings @flodolo

[ ] What happens to the issues filed on the existing repositories? (both open and closed)

rfk commented 5 years ago

The customs server is always deployed from a private repo

We could stop doing that if we figured out how to move the private pieces into config, about which I won't say any more in this public bug.

FWIW Mozilla occasionally seems to run out of private repos on its paid github plan, so there's some small value in consolidating all the FxA stuff into a single one.

flodolo commented 5 years ago

Regarding L10N, right now we only see one repository from Pontoon https://github.com/mozilla/fxa-content-server-l10n

I am not familiar with the automation regarding import and export from that repository.

Pontoon can be set up to commit to a different repository (we might need to drop and configure the project from scratch, cc @mathjazz for visibility), commit to master or a branch. Commits to master are going to pollute your history quite a bit.

jvehent commented 5 years ago

The customs server is always deployed from a private repo

We're also talking about replacing it with https://github.com/mozilla-services/foxsec-pipeline

philbooth commented 5 years ago

Just noting that I've ticked off the first checkbox, about CI, because @dannycoates got that stuff working pretty well already. See dannycoates/fxa-mono / dannycoates/682d0395.

The train-cutting scripts still make a pig's ear out of the changelogs as far as I can see, so I'm going to try and fix that aspect tomorrow.

@shane-tomlinson, if you're able to confirm that our l10n scripting will still work okay from the monorepo, I think it would leave us in a position where we can take a decision in the next Monday meeting whether we want to start Q2 in a fresh monorepo utopia, or if we need to keep our janky old fragmented workspace around for whatever reason.

dannycoates commented 5 years ago

I added fxa-content-server-l10n as a submodule and ran extract_strings.sh from there without issues. Whether we use a submodule or just manage it with scripts ala fxa-local-dev is an open question, but I think keeping it as a separate repo still makes sense.

philbooth commented 5 years ago

Whether we use a submodule or just manage it with scripts ala fxa-local-dev is an open question, but I think keeping it as a separate repo still makes sense.

Yeah, I'm definitely not a fan of submodules fwiw. If submodules are the answer, we're asking the wrong question.

shane-tomlinson commented 5 years ago

The customs server is always deployed from a private repo

We're also talking about replacing it with https://github.com/mozilla-services/foxsec-pipeline

Customs is going to live for the immediate future, so we will need a solution for deployments. We are going to need an fxa-mono-private for security fixes, so we can deploy customs from the -private repo using the same process.

Train cutting scripts. We can't have all the changelogs littered with commit messages from peer directories, so we need to make sure they're bound to just their own directory's changes.

This seems like a tractable problem, can we just call grunt conventionalChangelog from within each subdirectory after bumping the version in package.json and say we are good?

L10n. I have no idea how l10n works, but obviously we can't break it. Maybe that will be fine, but someone who knows that stuff needs to weigh in.

Let's keep l10n in its own repo. The string extraction script makes no assumptions about the directories in which the auth-server and content-server are located, we'll just need to update the Lambda function to pull a single repo and extract from the appropriate subdirectories.

What happens to the issues filed on the existing repositories? (both open and closed)

We could migrate all issues into a single repository and use labels or projects to manage. The worry is that it'll become more difficult to filter if there are 500 open issues in a single repo. OTOH, we won't have to search across 5 repos when we have that "I know I filed this bug somewhere!" moment.

philbooth commented 5 years ago

This seems like a tractable problem, can we just call grunt conventionalChangelog from within each subdirectory after bumping the version in package.json and say we are good?

No, because it includes commit messages from the entire repo and ruins the changelogs, as I mentioned above. I'm going to work on fixing this later.

clouserw commented 5 years ago

What happens to the issues filed on the existing repositories? (both open and closed)

We could migrate all issues into a single repository and use labels or projects to manage. The worry is that it'll become more difficult to filter if there are 500 open issues in a single repo. OTOH, we won't have to search across 5 repos when we have that "I know I filed this bug somewhere!" moment.

That just sounds like we need to do our triage and be realistic about closing issues that will never be implemented. I'm not too worried about sorting through the issues.

One thing that I hadn't really considered is how on-the-clock we are for this. Whatever tool we choose to replace waffle (and waffle itself, for that matter) is not going to manage to track metadata on issues that are migrated, so we're going to lose all the points at a minimum. The "columns" should survive since they are labels in github.

We are filing the issues for subscription services this week which means we need to get this sorted so we can start using a tool to track point estimates. (Next Monday would be ok, but we can't go much further and sooner would be better).

dannycoates commented 5 years ago

No, because it includes commit messages from the entire repo and ruins the changelogs, as I mentioned above. I'm going to work on fixing this later.

logging commits by directory is easy with git git log -- fxa-auth-server but I haven't figured out conventionalChangelog. There wasn't anything obvious to me in their readme.

philbooth commented 5 years ago

...I haven't figured out conventionalChangelog. There wasn't anything obvious to me in their readme.

Fwiw the approach I'm considering is to ditch conventionalChangelog and replace it with our own scripting. That's something we had success with in the email service, so it only seems like a short leap from there to extend it to something that is more broadly applicable and better tuned to our requirements.

shane-tomlinson commented 5 years ago

Fwiw the approach I'm considering is to ditch conventionalChangelog and replace it with our own scripting.

I'd love to get rid of the additional dependencies and use something we control.

philbooth commented 5 years ago

@dannycoates, two more repos to add to your monorepo shell script:

shane-tomlinson commented 5 years ago

@dannycoates, two more repos to add to your monorepo shell script:

mozilla/fxa-email-event-proxy mozilla/fxa-geodb

It might be nice to organize the subdirectories in such a way where common repos/services are in the top level dir, and ancillary repos such as 123done are in a combined subdir so as to not pollute the top directory with so many directories.

philbooth commented 5 years ago

I've added a new checkbox to the opening comment:

[ ] Rename and archive this repo, so that monorepo remote can be mozilla/fxa.

philbooth commented 5 years ago

Regarding the private repo, I've thought it through and think we need to fork the private monorepo from the public monorepo, then manually in a single commit apply changes from the private customs server repo.

This will ensure the private monorepo can be used as an alternative remote, with the same commit hashes in its tree as the public version. That's a requirement for us to keep working the same way with any new private monorepo.

Assuming we all agree on that approach, I'm ticking off the private repo checkbox in the opening comment.

philbooth commented 5 years ago

I have the beginnings of a working train-cutting script here, but there are some caveats:

It won't work for the first release that we cut in the monorepo, because it uses the last tag in history and assumes that all subdirectories were tagged with that tag. But I volunteer to do the manual massaging for that first tag, so that subsequent tags can use a shared script.
The pattern it matches for the changelog insertion point is slightly to different than will be currently found in our current changelogs, because they're no longer being generated by conventionalChangelog. Again, I volunteer to do the small amount of tweaking necessary in our first monorepo train cut so that subsequent trains will run smoothly.
I haven't implemented the customs server private repo shenanigans yet, because I wanted to make sure it's going in the right direction before spending time on that.

I can go into those in more detail in this afternoon's meeting, just wanted to note the broad position here for completeness.

philbooth commented 5 years ago

Added to the opening comment:

[ ] Make sure all pre-commit hooks are working okay, e.g. auth server email template versioning hook.

philbooth commented 5 years ago

I have the beginnings of a working train-cutting script here...

I've added another script to that gist, which can be used to cut the first train in the monorepo. It does a bit of extra work to pull out the individual tags from each sub-directory, so that it can pull the correct commits.

Neither script does the customs server stuff yet, I plan to wrap that up tomorrow.

philbooth commented 5 years ago

@dannycoates, are we certain that all of the old tags are being copied across by the monorepo script?

Reason I ask is that the new train-cutting script relies on those tags to work out which commits to add to the changelog. I list them like so:

git tag --list --sort=-v:refname | grep "^fxa-auth-server-"

The most recent one listed when I do that is fxa-auth-server-v1.132.1, so I'm not seeing any tags for train 133. Do you know why that is? The end result is that I'm adding commits from train 133 to the changelog for train 134...

EDIT: It also means I'm not updating the version strings in package.json and npm-shrinkwrap.json, because sed is matching against the wrong pattern for the old version number.

philbooth commented 5 years ago

I've updated the train-cutting scripts to merge and tag the private repo (which doesn't exist yet) too:

https://gist.github.com/philbooth/94a1f5ec37d9983a17adae1a38e13acd

That means they're pretty much ready to go I think, I encourage anyone interested to give them a try. Note that they fail right now because we haven't got a private repo to fetch changes from, but as soon as that exists they should run to completion cleanly. In the meantime you can comment out that chunk of code, towards the end of each script.

If you want to run them, note that the first tag in the monorepo must be generated with first-release.sh. That script takes one argument, which is a version string e.g. 1.134.0.

Subsequent tags should be generated with release.sh. That script takes one optional argument, patch. If patch is specified it bumps the patch number from the last tag, or if not the train number is bumped. The nice thing about using the last tag is you can checkout an old train branch and it will bump that patch number correctly, and it will also insert the relevant commits to the correct point in the changelog rather than blanket-inserting them at the top of the document.

Any comments or feedback, let me know. Once the monorepo is in place, I'll of course add release.sh with a pull request, so that it can go through our usual code review process.

shane-tomlinson commented 5 years ago

I'd also like to add these to the list:

Ensure fxa-dev works as expected
Ensure latest.dev.lcip.org updates so that we do not lose teamcity (I think this should happen if the first happens?)

philbooth commented 5 years ago

Closing this. We didn't fix every single problem yet, but there's separate issues open for everything so we don't need this one any more.

mozilla / fxa

Monorepo migration checklist #354