martinvonz / jj

A Git-compatible VCS that is both simple and powerful
https://martinvonz.github.io/jj/
Apache License 2.0
8.36k stars 287 forks source link

FR: Add `--insert-before` and `--insert-after` to `split`, `new`, `duplicate`, etc. #3772

Open thoughtpolice opened 4 months ago

thoughtpolice commented 4 months ago

The --insert-before (AKA --before) and --insert-after (AKA --after) arguments are recent, useful additions to the jj rebase command that allow you to move revisions to any given part of the graph, with the desired edges.

However, these flags could generally apply to almost any command where we move or insert a commit in the graph. For example:

These are the main ones I can think of that don't have this functionality, roughly in order of actual real-world importance.

Justification

While we can currently achieve all these with an extra rebase command, I think this is a much nicer and more consistent UX for users, especially ones who already use rebase with the --before and --after flags. These are very common wants people tend to have; "do a thing, then move it", and I think when you do this enough, you tend to want to do it often. So I think some reasonable shorthands for compound actions like this are valuable and reduce friction for new and experienced users alike.

It's true that this means certain verbs are now capable of moving nodes, whereas jj rebase was the main one to do it before. But I think that "only one way to move things" is more of a nice-to-have rather than a goal, in my mind. I also think the question shouldn't be "how many commands can do something another command does" but rather, what features work nicely together, what's the cost of maintaining them, do they conceptually fit, do they have a more general form, etc. I think these additions are both conceptually simple and also "fit" well.

More broadly, at a much higher level — my Mega Merge Strategy has (seemingly, somehow) become popular amongst some people, which has changed some of the basic workflows some users have come to enjoy and rely on; Benjamin liked this strategy enough that it lead him to implement --before and --after in the first place. This is a harmonious example of evolving requirements: how people use the tool is evolving in an unanticipated way, and so the tool in turn must evolve in kind to match new demands. (This is a good example of what I might call "give and take" that sometimes occurs in FOSS, in my mind.)

Finally, we have a fairly limited vocabulary of actions that actually do create or insert new nodes, so it's not like the surface area of this request is infinite or ongoing. I don't believe these features would introduce large technical or maintenance burdens; they seem to all dovetail rather nicely with each other and I don't think we have major plans to expand them, though some might arise.

Alternatives

Do nothing, and leave them as is. All of these examples can be achieved by doing the given command, then using jj rebase to achieve the desired effect.

martinvonz commented 4 months ago

By the way, jj rebase has --destination in addition to --insert-before/--insert-after. So we could also have jj split --destination etc.

Also, maybe that flag would be more consistently called --onto or similar, since it's really a third mode for "where" to put the commits. My goal is to clarify that -d, -B, and -A are mutually exclusive. Just a thought, not sure it's a good one.

PhilipMetzger commented 4 months ago

As my point is represented here (the single command thing, which I brought up yesterday with this message in Discord).

I just want to clarify my stance a bit by replying on the giving commands new capabilities part, see below.

It's true that this means certain verbs are now capable of moving nodes, whereas jj rebase was the main one to do it before. But I think that "only one way to move things" is more of a nice-to-have rather than a goal, in my mind. I also think the question shouldn't be "how many commands can do something another command does" but rather, what features work nicely together, what's the cost of maintaining them, do they conceptually fit, do they have a more general form, etc.

My point for having a single verb is to mostly avoid the wonderful CLI Git has and having fewer concepts simpler for new users. But as long as Jujutsu avoids the command flag bloat from Git, carefully expanding the capabilities of commands should be fine.

bnjmnt4n commented 4 months ago

I think it makes sense to add --destination as well, if we are going to consider adding --insert-after and --insert-before.

My goal is to clarify that -d, -B, and -A are mutually exclusive.

At least in jj rebase, -d is only mutually exclusive to -B/-A, but -B and -A can be used together.

jj new actually has -B and -A options, but they behave slightly differently from jj rebase. In particular, you can't combine -B and -A.

There's also an existing feature request for jj duplicate -d: https://github.com/martinvonz/jj/issues/3518.

Apart from the mentioned commands (new, split, duplicate), jj backout is also a good candidate for -B/-A, as mentioned in https://github.com/martinvonz/jj/issues/2802. In particular, perhaps a better default for the destination of the new commit would be --before @ instead of --destination @.

arxanas commented 4 months ago

"before" and "after" are fundamentally clunky. They're a two-argument form to describe a single logical value: "where" to create a commit. This is a limitation in the current jj data model that we shouldn't carry forward into the ultimate interface.

The problem is that the revset language supports identifying an existing commit, but not the imaginary places/"holes" where you would like to place a commit. Suppose that, in addition to RevisionSet as a valid type, there were also a RevisionLocation type, which identifies where to insert the commit without actually evaluating to a set of commits. Then you could write

jj rebase -s 'Y' -d 'location(after=X, before=Z)'

where

The semantics are that jj rebase and friends would assume that -d/--destination

without inventing new flags that need to be copied to all commands.


There are several other meaningful variations on placing commits that we could unify if we adopted an abstraction over commit placement:

And actually some even more general attributes about "how" to create the commit:


One intermediate step towards the above would be to remove the --before/--after flags and instead only expose replacing existing commits as the alternative to rebasing as children. Then we need only create a revset function that evaluates to a new commit as part of the operation, just like location as above, but without inventing a new type:

# create a new commit as part of evaluation inserted between `after` and `before`;
# doesn't need to be committed to the op log, necessarily, as the subsequent operations should create "real" commits
def new(after: revset = empty(), before: revset = empty()): ...

Then we can replace existing usages of --before/--after with something like

jj rebase --source Y --replace 'new(X, Z)'

without having to invent/implement an entirely new abstraction.

joyously commented 4 months ago

Do you really need two revisions for placement? That seems like a setup for confusion. What if the two aren't adjacent? Or am I misreading the intent, like using them in sequence with the output of the command that has multiple (split, duplicate)?

arxanas commented 4 months ago

Do you really need two revisions for placement? That seems like a setup for confusion.

It's a good question. In git-branchless, we actually only have the --insert flag, which always only means "insert before all existing children of the destination", which is enough for the majority of workflows, but not all. I would hope to see jj adopt a greater degree of generality, if possible.

Here's a case-wise analysis of workflows for rebasing a set of commits, in descending order of relative frequency. (Note that in jj, the empty "after" set is actually root() rather than truly empty.)

after before use-case
non‑emptyempty - This matches the common `--destination` workflow for `jj rebase`, etc.
non‑emptynon‑empty - When inserting commits between other commits. - Note that if commit `X` has immediate children `Y1` and `Y2`, and you want to insert a commit `Z` in between `X` and `Y1`, then you need to specify both `after = X` and `before = Y1` to get the exact topology that you want. - This is something that jj can do with its `--insert-before` and `--insert-after` flags. - This is something that git-branchless's `--insert` can't do. It would always set `before = children(X) = Y1 + Y2`.
emptyempty - Rare. This can happen if you're truncating repository history. Did this at work recently in Git. - Of course, it also incidentally happens when making the initial commit, but you don't explicitly specify it in those cases.
emptynon‑empty - Rare, corresponding to adding new roots to an existing commit graph - Did this at work recently in Git to add a common ancestor between two versions of the truncated-history repositories mentioned above, so that I could use Git's three-way merge tooling. But, uh... not common.

It's true that the empty "after" cases are rare in comparison to the non-empty "after" cases, so we should certainly design to accommodate that common case, but there are definitely power-user workflows that use the empty "after" cases (meee 😭).

This analysis is also why I defined the order of the parameters to the hypothetical new function in my previous post to be after then before, so that the meaning of new(X) is new(after=X, before=empty()), which is the most common case, as per this analysis.


What if the two aren't adjacent?

This only applies to the non-empty/non-empty case above, which is used for inserting commits. I'll give my thoughts on the common workflows with respect to adjacency.

There's also situation where you want to add a new edge between existing nodes in the commit graph. But jj does not operate on edges; it only operates on nodes. The way to express that in jj is to rebase a commit onto its existing parents, and then additionally include a new parent commit to create the new edge (and then rebase descendants as normal, etc.)


The "revset location" idea can handle all of these workflows, which is why I think it's a nice design, or at least some version of it as an underlying primitive.

I didn't mention it for brevity, but if we were to implement the full proposal above, I would expect several contextual defaults for the various configurable axes. Examples:

arxanas commented 4 months ago

I forgot to mention "mega-merge" explicitly in my workflow discussion above! That's the same thing as I was describing with adding parent edges and annotating dependencies locally in a way that's not supported by most other tooling.

bnjmnt4n commented 4 months ago

Thanks for the suggestion @arxanas! Both suggestions (RevisionLocation and --replace) seem intriguing.

It seems to me as though the RevisionLocation would be easier to implement (we could make all current --destination arguments (or similar) accept RevisionLocation or revset instead of just revsets, whilst other commands like jj log will still only accept revsets).

There are several other meaningful variations on placing commits that we could unify if we adopted an abstraction over commit placement

It seems as though these aren't really directly related to the location, so it might be confusing if these were passed as arguments/parameters along with the parents/child. I definitely agree that we ought to think of a way to express these attributes (adding all of these as arguments to each command has the same drawbacks of adding --before/--after).

For the new(X, Y) function, it seems a little weird to create new commits from within a revset. I will probably need to think about that one. --replace seems like it could be useful though.

arxanas commented 4 months ago

It seems as though these aren't really directly related to the location

The problem might only be the choice of the term "location". We want to describe "the manner in which commits are created and inserted into the commit graph", which encompasses location as well as topology and the other things.

For the new(X, Y) function, it seems a little weird to create new commits from within a revset.

There is the general issue that jj commands are not composable: it's hard to use the output of one command as the input to another. I've mentioned the weird side-effecting DSL approach before; I opened https://github.com/martinvonz/jj/issues/3814 to discuss it directly.

ilyagr commented 1 month ago

This bug is getting slightly unwieldy IMO. Focusing mostly on the title:

We could rename this bug or split it. We also don't have to change anything, but even if we don't, I hope my summary will be helpful for people to navigate related issues.