The original goal of this design was to support a form of code sharing between related projects, especially since we don't support forks for security reasons.
Anecdotally, there are two forms that branch-as-workspace usage seems to take:
1) Starting from a known point to pursue a new line of enquiry.
   - This is rarely merged back into the source branch, which basically makes it a fork. Most of our users are not experienced with merging changes back into main or similar workflows.
   - Users want a clean workspace to experiment with alternate approaches without accidentally overwriting previously generated files.
   - It preserves historic changes in an easy-to-access manner (accessing files at previous revisions is not an obvious UX for many users).
2) Some projects do want to share common code.
   - Users do not usually seem to use merging for this and copy and paste instead, again possibly because of a lack of git familiarity.
   - They have largely been directed towards extracting shared code into reusable actions instead.
There are also some explicit drawbacks to allowing branches for workspaces.
a) For private repos, all branches will be public when the repo is made public. To comply with OS policy, we need to make private repos public 12 months after the first job on any branch/workspace ran, but a workspace based on a recent branch of that repo might not be ready to be made public yet.
b) It adds platform complexity. It makes the system harder to understand, and makes browsing code in the right branch on GitHub more complex and easy to get wrong. If we only allowed a single repo with a main branch, then a bunch of things would be easier to implement and understand.
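To make drawback (a) concrete, here is a minimal sketch of how the publication deadline falls out when several workspaces share one private repo. The function name and the example branch names and dates are hypothetical, and "12 months" is approximated as the same calendar date one year later; the real policy calculation may differ.

```python
from datetime import date


def publication_deadline(first_job_dates):
    """Return the date a private repo must go public.

    Assumption: the deadline is 12 months after the earliest first job
    across all branches/workspaces of the repo.
    """
    earliest = min(first_job_dates)
    try:
        return earliest.replace(year=earliest.year + 1)
    except ValueError:
        # 29 Feb has no counterpart in a non-leap year; fall back to 28 Feb.
        return earliest.replace(year=earliest.year + 1, day=28)


# Hypothetical repo with two branch-based workspaces.
first_jobs = {
    "main": date(2022, 1, 10),
    "new-analysis": date(2022, 11, 1),
}

deadline = publication_deadline(first_jobs.values())
days_private = (deadline - first_jobs["new-analysis"]).days
print(deadline, days_private)
```

In this made-up case the newer branch gets only 70 days of privacy before the whole repo, including that branch, has to be made public.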
Suggestions
1) We should probably not allow new workspaces to use branches of private repos - they should be new repos. This will stop future problems when it comes to making those repos public.
2) We should consider not supporting branch-based workflows going forward, and requiring new repos.
   - This probably needs more discussion and discovery, and may not be the right call.
   - We would need improved documentation and tooling for "forking" existing repos (e.g. opensafely clone?), or to make it easier to create reusable code in other ways.
   - We'd need to either migrate existing branch-based workspaces or provide backwards-compatible support for them.
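As a rough illustration of suggestion 1, the check at workspace creation time could look something like the sketch below. The `Repo` type, the `may_create_workspace` function and the reading of "branches of private repos" as "any branch other than the default branch" are all assumptions made for illustration, not existing platform code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repo:
    # Hypothetical minimal model of a repo, for illustration only.
    name: str
    is_private: bool
    default_branch: str = "main"


def may_create_workspace(repo: Repo, branch: str) -> bool:
    """Sketch of suggestion 1.

    Assumption: a private repo only accepts new workspaces on its
    default branch; public repos are unrestricted.
    """
    if not repo.is_private:
        return True
    return branch == repo.default_branch


private_repo = Repo(name="example-study", is_private=True)
print(may_create_workspace(private_repo, "main"))
print(may_create_workspace(private_repo, "new-analysis"))
```

Existing branch-based workspaces are deliberately out of scope here; they would still need migration or backwards-compatible handling as noted above.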
We currently support creating a workspace that points to a branch on a repo.
Currently, 60 repos have more than 1 workspace linked to them: