jseldess commented 5 years ago

Jesse Seldess commented:

Our current versioning approach: Keep source files for all versions in master via subdirectories (e.g., v2.1, v19.1, v19.2), use custom Jekyll logic to generate the site structure properly, and rebuild the entire site on each merge to master.

The main benefit is that writers can easily edit multiple versions of a file in a single PR.
The main downside is that, with each new version, master grows in size and complexity and the Jekyll build slows (though this doesn't matter too much for TeamCity builds, and we implemented a [workaround for local builds](https://github.com/cockroachdb/docs/pull/4823)).

The current approach made good sense early in our history, when we were working across doc versions constantly. It makes less sense now, as our team and content have grown and will continue to grow. We should look into a branch/tag-based approach instead.

The main benefit would be a cleaner, more manageable repo. master would house only the very latest version for which we offer documentation, similar to how versioning in the cockroach repo works.
The main downside would be added git complexity when needing to apply a change across versions (separate PR to backport, etc.).

Tentative plan:

[ ] Choosing and designing the new branch/tag-based approach.
[ ] Identifying required changes to our repo.
[ ] Identifying required changes to our build process. There's a chance that we could [leverage Netlify](https://www.netlify.com/docs/continuous-deployment/#branches-deploys) more than we are currently.
[ ] Implementing the changes to our repo.
[ ] Implementing the changes to our build process.

Jira Issue: DOC-284

jseldess commented 5 years ago

Notes from @benesch:

It’d be a decent amount of work, I think. You’d basically delete the version plugin, since Jekyll will only be concerned with building one version. And then the publish script would do something like `for version in v1.0 v2.0 v2.1 master; do git checkout $version && jekyll build --outdir=$version; done`. But I’m not sure how you would make the version switcher work. Right now there’s a lot of smarts about how to map pages that change names across versions, and that would all have to go into the publish script somehow.

If you can live with a dumber version switcher that occasionally takes you to a page that doesn’t exist in an old version, it’s a pretty quick project. But if you want the same level of smarts in that version switcher, it’s a semi-involved project.
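A rough sketch of the publish loop described in these notes, written in Python so the command plan can be inspected before anything runs. It assumes each version exists as a branch or tag; the `published` output root and the function names are hypothetical. Jekyll's documented flag for the output directory is `--destination`, which this sketch uses in place of the `--outdir` shorthand in the notes. It does not attempt the version-switcher mapping.

```python
import subprocess
from typing import List


def publish_commands(versions: List[str], out_root: str = "published") -> List[List[str]]:
    """Return the git and Jekyll commands needed to build each version in turn."""
    commands = []
    for version in versions:
        # Switch the working tree to the branch or tag for this version.
        commands.append(["git", "checkout", version])
        # Build into a per-version directory under the (hypothetical) output root.
        commands.append(["jekyll", "build", "--destination", f"{out_root}/{version}"])
    return commands


def publish(versions: List[str], dry_run: bool = True) -> None:
    """Print the plan, or run it and stop at the first failing command."""
    for cmd in publish_commands(versions):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    publish(["v1.0", "v2.0", "v2.1", "master"])
```

With `dry_run=True` (the default here) nothing is checked out or built; the script only prints the commands it would run.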

ianjevans commented 3 years ago

Just putting in some background context here from other formats/publishing pipelines and how they deal with the versioning problem.

In Antora + asciidoc, the site config file includes different sources, and each source has a version. The site file can include both local and remote sources. The sources consist of Antora modules, which can also be either local or remote. Antora consolidates all the sources on build into a single doc set, with a version switcher based on the versions defined in the source config files.

At a previous company, we authored in DITA and used versioned -dev/-release branches for development. Backporting fixes involved multiple PRs, one to each affected version. For example, suppose the most current release is 20.2, and a doc fix is needed for 20.2, 20.1, and 19.2. The writer opens a PR to the 20.2-dev branch. After the PR is squashed/merged, the writer cherry-picks the merge commit to 20.1-dev and 19.2-dev, then opens a PR for each version.
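To make the manual backport steps concrete, here is a minimal sketch that lists the git commands a writer would run for each older version. The `backport-…` branch naming, the `origin` remote, and the function name are assumptions for illustration, not the previous company's actual tooling. It also assumes the fix was squash-merged, so the commit has a single parent and `git cherry-pick` needs no `-m` option.

```python
from typing import List


def backport_commands(commit: str, versions: List[str], topic: str) -> List[List[str]]:
    """Return git commands that cherry-pick one commit onto each <version>-dev branch."""
    commands = []
    for version in versions:
        base = f"{version}-dev"                      # e.g. 20.1-dev
        branch = f"backport-{version}-{topic}"       # hypothetical naming scheme
        # Start a new branch from the remote tip of the target -dev branch.
        commands.append(["git", "checkout", "-b", branch, f"origin/{base}"])
        # Apply the squashed fix from the newest version.
        commands.append(["git", "cherry-pick", commit])
        # Push so a PR against the -dev branch can be opened.
        commands.append(["git", "push", "origin", branch])
    return commands


if __name__ == "__main__":
    for cmd in backport_commands("abc1234", ["20.1", "19.2"], "fix-typo"):
        print(" ".join(cmd))
```

Opening the PRs is left out because it depends on the hosting tooling; conflicts during the cherry-pick would still need to be resolved by hand.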

At each point release, the -dev branch would merge into -release. High-priority doc fixes that needed to be published outside of the point release cycle required additional cherry-picks/PRs to -release. This was eventually automated by tagging the -dev PRs with a "hotfix" label, which would trigger a GitHub Action that would automatically branch, cherry-pick, and open a PR to -release after the -dev PR with the "hotfix" label was squashed and merged.

As you can see, it was a complicated system, and it became a pain-point as the number of supported versions grew.

ianjevans commented 3 years ago

A better approach would be to use GitHub tags for each version. Each version has a branch, and when the content is ready to be published, we'd push a tag.

You still need to backport changes to the different branches. This could easily be automated with a GitHub Action (or equivalent), similar to what I described above.

The publishing pipeline would then pull in the tags to assemble everything.

One possibility is to use https://github.com/netrics/jekyll-remote-include to include remote .md files from the tag trees.

exalate-issue-sync[bot] commented 1 year ago

Nick Vigilante (nickvigilante) commented: Moving this ticket to be an epic.

After working with Netlify, it seems like a good approach with relatively little overhead to implement would be to set up a monorepo: https://docs.netlify.com/configure-builds/monorepos/

A monorepo keeps everything in the same branch, but we have specific subdirectories hosting each version of the docs. If a writer were to change multiple versions of the docs, we’d run a separate build for each version, which might be fine, depending on the number of concurrent builds we have.
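As an illustration of "a separate build for each version", the sketch below works out which version subdirectories a PR touches from its list of changed files. It assumes the version directories sit at the top level of the repo and are named like `v2.1` or `v19.2`; the function name and sample paths are hypothetical, and this is not how Netlify itself decides what to build.

```python
import re
from typing import Iterable, List

# Top-level directories that look like a docs version, e.g. v2.1 or v19.2.
VERSION_DIR = re.compile(r"^v\d+\.\d+$")


def versions_touched(changed_paths: Iterable[str]) -> List[str]:
    """Return the version directories containing at least one changed file."""
    touched = set()
    for path in changed_paths:
        top = path.split("/", 1)[0]
        if VERSION_DIR.match(top):
            touched.add(top)
    # Sort numerically so v2.1 comes before v19.2.
    return sorted(touched, key=lambda v: tuple(int(p) for p in v[1:].split(".")))


if __name__ == "__main__":
    changed = ["v19.2/install.md", "v2.1/install.md", "_includes/nav.html"]
    print(versions_touched(changed))
```

Files outside any version directory (shared includes, for example) are ignored here; in practice a change to shared files might need to trigger every version's build.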

Regardless of whether we move to branch-based versioning, a monorepo seems to me like a good starting point: a lot of the required work is shared between a monorepo and a branch-based versioning scheme, with little impact on writers' workflow or the display of the final product. The version switcher is a notable exception. Nikhil is correct that the version switcher currently has logic to calculate whether a doc exists in a prior version of the software. We’d be sacrificing some functionality with the version switcher.
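For context on what that switcher functionality amounts to, here is a simplified sketch, not the existing Jekyll plugin. Given the pages known to exist in each version and an optional rename map, it returns the link for the target version and falls back to that version's landing page when the page is missing. The page names, the rename-map shape, and the URL layout are all hypothetical.

```python
from typing import Dict, Optional, Set, Tuple


def switcher_target(
    page: str,
    target_version: str,
    pages: Dict[str, Set[str]],
    renames: Optional[Dict[Tuple[str, str], str]] = None,
) -> str:
    """Return the URL the version switcher should link to for `page`."""
    renames = renames or {}
    # If the page has a different name in the target version, use that name.
    candidate = renames.get((target_version, page), page)
    if candidate in pages.get(target_version, set()):
        return f"/{target_version}/{candidate}"
    # The page does not exist in the target version: link to its landing page.
    return f"/{target_version}/"


if __name__ == "__main__":
    pages = {
        "v19.2": {"install.html", "backup.html"},
        "v2.1": {"install.html", "back-up-data.html"},
    }
    renames = {("v2.1", "backup.html"): "back-up-data.html"}
    print(switcher_target("backup.html", "v2.1", pages, renames))
```

In a branch-based setup, the `pages` and `renames` data would have to be collected across branches at publish time, which is the part the notes above describe as semi-involved.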

Once we accomplish all the tasks common to a monorepo and branch-based versioning, we can ask the team which approach they’d prefer in terms of saving time. Would they rather:

Create a single branch and single PR and have Netlify spin up multiple deploy previews automatically while still having everything on one main/master branch, or

Create a branch and orchestrate the backports automatically based on labels similar to the CRDB repo?

A disadvantage of branch-based versioning with automatic backports is that we wouldn’t be able to switch SSGs. Monorepo gives us the ability to switch SSGs very easily. We can’t automatically backport a single change across multiple SSGs, and our docs are in need of a visual refresh.

A branch-based versioning scheme would also require us to add backport tooling.

I’ll add some subtasks.

nickvigilante commented 1 year ago

Closing this to resume work on the Jira epic with the same ticket ID: https://cockroachlabs.atlassian.net/browse/DOC-284

cockroachdb / docs

Consider a move to branch-based versioning #4959
