Open hooper opened 5 months ago
We fix files in X.., instead of rebasing them,
Anecdotally, Google has had many users of hg fix over many years, and they don't ask for the rebasing behavior. They do occasionally ask for something like this FR.
I don't really understand the FR as written. Is it not essentially proposing rebasing the descendant commits instead of fixing them? Can you provide a specification of the behavior not in terms of the current implementation?
Rebasing is potentially faster than some code formatters, but not enough to justify exposing users to unnecessary conflicts
Rebasing instead of fixing or reparenting descendants becomes important/useful if the fixes are expensive to compute (just opened https://github.com/martinvonz/jj/issues/3808 in relation). It's annoying when I have to wait for cargo clippy --fix
to fix the same lint issues in all of the commits repeatedly.
IMO, We should always reparent instead of rebasing (to avoid conflicts), but we should do so while solving the problem that @arxanas mentioned.
def fix(revisions):
fixes_required = {rev: changed_files(rev) for rev in revisions}
for rev in revisions:
new_files = rev.files
for file in changed_files(rev):
if file in fixes_required:
file.set_content(run_fix_command(file.path))
else:
# There was originally no change, but the fix command changed our parent
file.set_content(rev.parent.files[file.path])
reparent_children(rev)
Suppose that commit A
changed files foo
and bar
, and commit B
(a child of A
) changed file foo
, and commit C
(a child of A
) is empty. If the user runs jj fix -r 'mutable()::B'
, and their fix command was fix
, then we should:
fix
on foo
and bar
in A
, producing A'
B
and C
onto A'
B
foo
and bar
have been changed, but that only foo
is in fixes_requiredfix
on foo
jj restore --from A' --to B bar
because it isn't in fixes_requiredC
is now non-empty, containing the un-fixed foo
and bar
.[...] but we should do so while solving the problem that @arxanas mentioned. [...]
- Determine that foo and bar have been changed, but that only foo is in fixes_required
- Run fix on foo
Unfortunately, this approach doesn't solve the problem I have. It supposes that a fix can be run on individual files, but (as far as I can tell) cargo clippy --fix
can't run on individual files. Build tools are particularly likely to be 1) expensive and 2) unable to run on individual files. See also https://github.com/martinvonz/jj/issues/3808.
The file-level-granularity "redoing" approach you suggest makes sense for many other classes of formatter-style fix. It's also similar to an approach I recall at a previous company where people did automatic codebase-wide refactorings and opted to drop changes in conflicting files altogether, although the redoing would be done manually at a later time.
Why is there a fix
command? It seems beyond the scope of a version control system.
@joyously It needs to operate on and modify the commit graph. How could it exist without the version control system? It might be best to start a separate discussion regarding the suitability/scope of the fix
command, if you have something in mind.
This discussion seems to be about code formatting. That would be a normal change in files, not anything to do with the commit graph.
It’s for applying tools like formatters to a range of commits in batch.
Unfortunately, this approach doesn't solve the problem I have. It supposes that a fix can be run on individual files, but (as far as I can tell)
cargo clippy --fix
can't run on individual files. Build tools are particularly likely to be 1) expensive and 2) unable to run on individual files. See also #3808.The file-level-granularity "redoing" approach you suggest makes sense for many other classes of formatter-style fix. It's also similar to an approach I recall at a previous company where people did automatic codebase-wide refactorings and opted to drop changes in conflicting files altogether, although the redoing would be done manually at a later time.
By design, fix is not suitable for all use cases - like you said, fix is designed to run formatters which take only a single file as input. Your use case is reasonable, but it should be solved by jj run
, so I don't see a problem with it not being solved by jj fix
.
I still don't see why a VCS would have a command that is for affecting the content in batch when that changes history. If it's for doing a batch chmod
or resetting the commit date or author or branch(?), that sort of makes sense, but not for doing what is basically a file operation on commits (although I know I mentioned chmod
...).
@joyously I'm not sure what the problem is with changing history, though maybe I'm misunderstanding your point. Literally everything injj
changes history, by design, and I don't see any issues with changing history if you're only changing mutable commits (which is the only thing you can do by default).
jj squash
changes history for all children of the commit you amendjj squash my_file --to <commit id>
changes history of some arbitrary commit and its descendantsMy use case is:
jj commit
jj commit
jj fix -r mutable()..B
, then jj git push --change B
If I were to not use jj fix
, I'd instead do:
jj edit A
<list files in A> | xargs clang-format
jj edit B
-> possibly resolve merge conflicts since, unlike jj fix
, this rebases instead of reparents<list files in B> | xargs clang-format
jj git push --change B
It's a massive pain in the ass, and still ends up rewriting history, so I'm confused about your opposition to jj fix
rewriting history. Mind you, I really don't like the way it's currently implemented, with a -s
flag (I'd much prefer -r
to specify the set of revisions you want precisely). Maybe that'd appease your concerns somewhat, given that the history rewrite is limited to the revisions that you specify precisely.
Also, how do you feel about jj run
then (#1869)? That can run more complex commands, and seems extremely useful to me for updating goldens, for example.
@matts1 Your use case can be solved other ways: simply run your formatter and squash that into whatever revision you want (or leave as a separate commit with its own message since it is a separate change).
I guess my opposition comes from a viewpoint of the VCS being invoked only when the code is ready to share, and that the historical states are those already shared.
I just don't see why a VCS has anything to do with batch processing of file contents.
Your use case is reasonable, but it should be solved by jj run, so I don't see a problem with it not being solved by jj fix.
I contest the design and argue that jj fix
should also handle it as per https://github.com/martinvonz/jj/issues/3808 (and specifically not jj run
). That being said, I am generally favorable towards some kind of solution that offers redoing/throwing away on a file-level granularity even for module-level fixes because I've seen it work well in practice, so there might not be too much conflict regardless of whether we make jj fix
support working-copy-level fixes.
Your use case can be solved other ways: simply run your formatter and squash that into whatever revision you want [...]
One would hope so, but I've found that it doesn't work that well in practice due to meaningless conflicts:
I guess my opposition comes from a viewpoint of the VCS being invoked only when the code is ready to share [...] I just don't see why a VCS has anything to do with batch processing of file contents.
Generally speaking, there are two cohorts of VCS user when it comes to this kind of thing:
jj run
/jj fix
are mainly useful to the first cohort. Both of these commands are based on commands from other VCSes, so many think that both are generally useful — but it also means that the design hasn't been widely discussed and motivated, so it's often not obvious to the unfamiliar user what the advantages might be.
jj run
that could help motivate it.git-branchless test
docs detail one interface, which isn't directly motivating, but could shed light on potential use-cases.@joyously I'm happy to discuss more but I think we should move to a separate discussion of the general suitability of jj fix
/jj run
command for a VCS, if you want to start one. (I am personally a big proponent, having implemented the variant for git-branchless
😁)
Your use case is reasonable, but it should be solved by jj run, so I don't see a problem with it not being solved by jj fix.
I contest the design and argue that
jj fix
should also handle it as per #3808 (and specifically notjj run
).
I did not follow this discussion closely but can you explain what the semantic difference here should be? I heavily agree with @matts1 that this functionality is covered by jj run
.
@joyously My take is that jj
should be fully supportive of rewriting history as much as you want. There's nothing wrong with your approach, but there's also nothing inherently wrong with rewriting history.
The reason there's nothing wrong with rewriting history is that if rewriting history for a commit was truly bad, then it should go into your immutable heads. For example, if you, for whatever reason, didn't want to support amending a commit that had been sent to github for review, just put remote_branches(remote="github")
in your immutable heads.
FWIW, I think this may also be related to the code review system people use. Your principles you mention are probably a very good set of principles for working with a code review system where you review PRs, and especially so if they don't allow rewriting history. However, rewriting history is the recommended approach for both gerrit and piper, because both of these systems have you review individual commits, where each commit has a stable change-id. So I can send out commits A and B, get a review requesting changes on A, jj edit A
, then jj piper upload B
to upload both the amended A and the auto-rebased B.
I do agree with @arxanas that there are definitely two classes of VCS users. During the development process, I'll be doing rebases, amends, octopus merges, squashes, editing intermediate commits, etc. After code review, I continue doing all that, rewriting history on existing commits already sent for review. I only stop rewriting history once a piece of code has been submitted.
I think the metrics we should be marking against are:
Is it going to create problems?
I think what bothers me about a fix
command is
immutable()
or mine()
. How would you know if you affected something you didn't intend to, when you enter the wrong revset?undo
and a concurrent usage?I haven't followed the whole discussion, but I wonder whether this could be addressed by some version of --verbatim-rebase
from #1027, similarly to jj amend --verbatim-rebase
.
Conceptually (and perhaps for real), I wonder if the version of jj fix
that creates a child commit for each commit it processes should be considered as the "most basic" version of the command. Then, the question of how and whether to squash those commits into parents becomes a separate question.
Is it going to create problems?
I think what bothers me about a
fix
command is
- It sounds like "fixing a bug" in that some tools have keywords to close the corresponding bug ticket.
I don't have any particular qualms with the name, but I wouldn't be opposed to renaming it to something like jj format
.
- Running an arbitrary tool in batch doesn't update the descriptions (or does it?)
It does not (and it shouldn't, considering the use case). This isn't for arbitrary tools, but specifically for formatters.
- It is not clear that there are any boundaries such as
immutable()
ormine()
. How would you know if you affected something you didn't intend to, when you enter the wrong revset?
I don't see how this is any different from any other command that affects multiple revisions (eg. rebase). And the boundary on immutable is simply that jj won't let you edit an immutable commit. That being said, re "how would you know if you affected something you didn't intend to", the only time it should ever be a problem to change history by running a formatter on a change and changing history is if that commit should not be changed at all, and those commits should just be marked as immutable. If I accidentally ran a formatter on a commit I didn't intend to, my code looks prettier but still runs the same. As proof of my earlier statement:
$ jj fix -s main
Error: Commit 63b808e40000 is immutable
Hint: Pass `--ignore-immutable` or configure the set of immutable commits via `revset-aliases.immutable_heads()`.
- Is the whole file affected, or is it filtered to only the lines that are in the diff?
The whole file - formatters generally work on a file level. I don't think there's anything wrong with this. If I ran clang-format
, it's not going to run on only the changed part of the file. This command should be no different from running it manually.
- At what point is the snapshot taken, or are there more than one? Is this compatible with
undo
and a concurrent usage?
I'm no expert on this, but I'd imagine once before the command is run, and once after the command completes. I don't see how it's any different from any other command that rewrites history (eg. jj squash
or jj rebase
) and a concurrent usage though?
- Aren't there already commands to propagate edits to descendants?
This is an extremely common use case, and to the best of my knowledge, there's no easy way to achieve it.
Conceptually (and perhaps for real), I wonder if the version of
jj fix
that creates a child commit for each commit it processes should be considered as the "most basic" version of the command. Then, the question of how and whether to squash those commits into parents becomes a separate question.
I strongly disagree with this one. While I do like the exisence of --verbatim-rebase
, I think it doesn't make sense here:
I strongly disagree with this one. While I do like the exisence of --verbatim-rebase, I think it doesn't make sense here:
I am not suggesting this should be the default behavior. We can also leave it at "conceptually" and not do it in practice at all.
My main point is that we can think of this command as doing two relatively orthogonal things. We can think separately about the options for how to "fix" the commit, and for options of how to then squash or integrate the new patch into the commit graph.
This also suggests to me that solving this in parallel to squash --verbatim-rebase
(or whatever we decide to call it) might be reasonable.
I cannot think of anything that a user could possibly do with it other than just immediately squash them in to create the linear history that jj currently creates
I can think of a few possible things: squash normally, squash with --verbatim-rebase
(this is effectively what has to happen if the child commit is also fixed), rebase the other descendants on top of the new commit, or rebase the other descendants on top of the new commit with --verbatim
.
It bothers me a little (unrelated to your comment) that whether or not using --verbatim
is natural seems to depend a lot on whether the descendants are also being fixed or not.
Formatters inherently are not something that a human reviews and says "are the changes good". You just blindly apply a formatter and don't need to verify them.
This is true, I'm perhaps polluting this discussion by thinking about jj run
, e.g. using LSP to rename a variable as the "fix". TBH, I'm not sure where the boundary lies and which command that usecase belongs to.
I can think of a few possible things: squash normally, squash with
--verbatim-rebase
(this is effectively what has to happen if the child commit is also fixed), rebase the other descendants on top of the new commit, or rebase the other descendants on top of the new commit with--verbatim
.
I do agree with two of those three use cases. I don't currently like that only the first option is currently supported - I personally like the idea of a -r
option (which is what this issue was originally about), because it would allow your third option to happen. I can't see a world where option 2 is useful, but if someone did come up with a reason I wouldn't be opposed to it.
With all that said, I'd much prefer jj fix -r
or jj fix --regular-rebase
over jj fix --new
, then jj squash
manually. I'd prefer for the complexity to be in the command implementation, rather than what the user has to run.
I personally like the idea of a -r option (which is what this issue was originally about), because it would allow your third option to happen. I can't see a world where option 2 is useful, but if someone did come up with a reason I wouldn't be opposed to it.
I may have misunderstood what this issue is about. My first guess at what the -r
option for fix
meant would be to do option 2 for the children of the heads of the revset being fixed. So, we could call it jj fix X --verbatim-rebase
or something along those lines (though I'm not sure about it, it just seemed like an option nobody mentioned). I didn't think my options 3 or 4 were being considered here. I'm now a little confused, but perhaps that's because I jumped in mid-conversation. Perhaps things will be clearer if I reread the thread more carefully.
With all that said, I'd much prefer jj fix -r or jj fix --regular-rebase over jj fix --new, then jj squash manually.
Yes, I never meant the user to be forced to squash things manually. If we implemented manual squashing at all, the users would have to opt into it.
@hooper What was your intention here? I had assumed that we would never want to rebase when we run jj fix
because that would create conflicts, and thus my assumption was that jj fix -r <revset>
would mean "fix the revset, then reparent all children of that revset", but it appears that @ilyagr had thought otherwise.
Yes, I think this was my confusion.
I assumed that jj fix X
would rebase descendants of X onto the new version of the commit, as every other command in jj
would, and that the request of this issue would be to have an option to do a "verbatim rebase" (which a lot of people seem to call reparenting).
In reality (as I'm sure most people reading are well aware), jj fix X
doesn't exist at the moment, there is only jj fix -s X
.
It is true that "reparenting" behavior seems generally more useful for a jj fix
command, so I can see why one might assume we'd do that by default.
I'm unsure what's best. I think reparenting by default might be reasonable, but it has the problem that it's confusingly different from how all other jj
commands behave. I'd be happier with it if there was some way it was communicated to the user (via the name of the option, say), but I'm not insisting on this either.
It occurs to me that this sounds an awful lot like a pre-commit hook.
It does have similarities, but pre-commit hooks don't make sense in jj
(see my section on pre-commit hooks in https://github.com/martinvonz/jj/issues/3577#issuecomment-2087816381), since jj
can't tell when you intend to commit. In that thread, we basically determined that since rewriting history is so easy in jj, all pre-commit hooks can easily just be done as pre-upload hooks instead, and there's no disadvantage to doing so.
So the workflow for most users will probably be:
jj fix
jj lint
? This might be done via jj fix
or jj run
, depending on whether the linter does things a single file at a time.jj git push
/ jj piper upload
/ jj gerrit send
Problem:
Some users find that
hg fix -s X
formats more files/lines than they wanted to be formatted inX..
.Solution:
If the user specifies
-r X
instead of-s X
, don't include the diffs ofX..
in the files/lines to be fixed. The details will depend a lot on how we implement line ranges. As for the set of files to be fixed, it basically means we would skip theparent_tree.diff_stream()
section ofjj fix
when pickingToolInput
s forX..
.Alternatives:
We fix files in
X..
, instead of rebasing them, to avoid merge conflicts between the user's changes and the code formatter's changes. I consider rebasing to be a worse alternative, because those conflicts are both confusing to deal with (especially if a diff tool ignores whitespace changes) and not very useful. Rebasing is potentially faster than some code formatters, but not enough to justify exposing users to unnecessary conflicts.Context:
Anecdotally, Google has had many users of
hg fix
over many years, and they don't ask for the rebasing behavior. They do occasionally ask for something like this FR.