Open arxanas opened 3 months ago
I'm partial to the run/fix distinction being equivalent to the per-working-copy/per-file distinction, but mostly due to familiarity with such an implementation for `hg`. I wonder if there are other granularities we should consider, and how that impacts our UI choices?
The `--exec` flag feels like an undesirable Git-ism, but I think we agree that a terse way of specifying a command is important in both the per-working-copy model and the per-file model. `hg fix` is lacking this, although command aliases can bridge some of the gap. It's more natural in the per-working-copy model because less configuration is needed (or maybe some of the info is instead communicated by the subprocess running `jj` commands).
The `--fix` flag feels like a bool where we need an enum. If `jj run` has such an enum, `--exec` probably doesn't belong in `jj fix`. Along with a way to determine how much (none, all, a file set, ...) of the specified commit gets updated as a result of a command, the ways of handling its descendants include:
I should read more about `git-branchless-test fix`. When I find the time, I'll write up my preferred mapping of use cases to CLI syntax.
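To make the "bool where we need an enum" point concrete, here is a minimal Python sketch. The names `UpdateScope` and `paths_to_update` are hypothetical and not part of jj; they only illustrate how one enum-valued option could replace a boolean `--fix` when deciding how much of a commit gets updated from a command's output.

```python
from enum import Enum


class UpdateScope(Enum):
    """Hypothetical values for how much of a commit a command may update."""
    NONE = "none"          # run the command, discard any file changes
    ALL = "all"            # keep every file the command modified
    FILE_SET = "file-set"  # keep only modifications to selected files


def paths_to_update(scope, modified, selected=frozenset()):
    """Return the modified paths that should be written back to the commit."""
    if scope is UpdateScope.NONE:
        return set()
    if scope is UpdateScope.ALL:
        return set(modified)
    # FILE_SET: intersect what the command touched with what the user selected.
    return set(modified) & set(selected)
```

A boolean can only express the first two cases; the third one (and any later additions) is where the enum starts to pay off.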
Do you see either `fix` or `run` as related to some sort of `bisect`? How about `blame`?
Do you see either fix or run as related to some sort of bisect? How about blame?
Yes. Here are my most common workflows:
In practice, many (all?) of these benefit from configuring the search behavior (git-branchless-test supports the following behaviors):
The majority of my usage is on my local commit stacks, not historical commits, so I don't have much to say about the relation to `blame`, except that I am indeed trying to attribute behavior to a certain change.
I'm partial to the run/fix distinction being equivalent to the per-working-copy/per-file distinction, but mostly due to familiarity with such an implementation for hg. I wonder if there are other granularities we should consider, and how that impacts our UI choices?
I don't see why a user would consider fixing via per-file or per-working copy to be relevant for the UI.
`jj fix` and others had to be done via `jj run --fix`.
The main distinction for me as a user in practice is whether a fix is "slow" or "fast".
- `git test fix -x 'cargo clippy --workspace --fix'` always works and could be a pre-defined tool/configuration,
- `git test fix -x 'cargo clippy -p <package> --fix'` if I know which package I want to operate on and save myself some time (if I'm sensitive to that).

Per-file vs per-working-copy fixes roughly correlate with "fast" vs "slow", but not always.

- `cargo fmt` is fast enough that it's not worth the time to manually or automatically configure it to run on only changed files.
- `cargo update` technically requires looking at all the `Cargo.toml`s in the workspace, and is fundamentally not able to operate per-file, but it's fast enough in my repo that I don't care.

I surveyed a couple of users in Discord about their mental models.
NOTE: The transcripts below are not verbatim; I removed and reordered messages to try to follow the individual threads.
@arxanas: When you consider a command called jj fix, what is your first impression in terms of what the command would be able to do/what the scope would be?
@arxanas: Do you expect that modifications dumped into the working copy with jj run will be automatically reflected in the caller (like it'll do the same rewrites as jj fix), or would you have to do something else to indicate that you want the changes to be reflected outside the jj run working copy? (Or, opposite, do you expect that there are cases that jj run changes to the working copy should be discarded/ignored?) Like do you imagine yourself using jj run with commands that don't modify the working copy?
@arxanas: By default, would you expect jj fix to operate on all files in the working copy, or only files changed in the commit?
@tingerrr responses:
@tingerrr: I would expect it to be all of them
- `jj fix` paradigm.
- `jj fix` as running on entire revisions by default.
- `jj fix` is seen as a more ergonomic or efficient way to accomplish per-working-copy tasks that could still be done with `jj run`; that is, `jj run` is a different level of abstraction than `jj fix`.
- `jj run` to apply fixes without opting in.

Compared to my original mental model:
- `jj fix` to be the same fundamental kind of thing as `jj run`.
- `jj fix` should be able to materialize working copies in some mode, to reduce user surprise/friction here. I don't see indication that users consider them to be specifically different in terms of "per-file" vs "per-working-copy" operation.
- `jj run` use-cases easier, but not necessarily that it's incompatible with `jj run` use-cases.
- `jj fix` as being at a higher level of abstraction than `jj run`.
- `jj fix` and `jj run --fix`, which I was against in my original post here.
- `run` and `fix` at the same level of abstraction.

Thanks @arxanas for summarizing all that. I obviously haven't thought about this as deeply as y'all have, but if I were building these features, I think this is how I would go about it:
`jj run` for all use cases

I think of `jj run` as being the Swiss army knife that can do anything, so it would be implemented first with a suite of options:

- `npm install`), or some projects might be dependent on the absolute file system path
- `run.tools.prettier.command = ["prettier",...]`, but you can also run one-off commands with `--exec` or similar.

I think this supports most use cases, but is probably slow because it runs sequentially and materializes everything.
Example:
# Run an ad hoc command
$ jj run --amend-changes --exec 'dotnet format'
# Run a pre-configured command
$ echo '
[run.tools.prettier]
command = ["prettier"]
amend-changes = true
' > ~/.jjconfig.toml
$ jj run --tool prettier
# Run unit tests, display failures somehow on non-zero exit code
$ jj run --exec 'npm test'
Since some projects/tools don't care about their absolute path, add an option `jj run --parallel` that creates temporary working copies and executes the tool there. There could be options for how many to run in parallel.
$ jj run --tool prettier --parallel
# configuration: `run.tools.prettier.parallel = true`
Some tools won't need a working copy, so it would be nice to avoid that to make everything faster. Since this is so different from the other execution mode, I am indifferent between:
- `jj run --mode file-filter` (and `run.tools.NAME.mode = "file-filter"`) to execute the tool like a Unix filter, or
- `jj fix` which does this

If jj went with `--mode file-filter`, I would probably just create an alias named "fix" that does that.
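For readers unfamiliar with the "Unix filter" execution mode mentioned above, here is a rough Python sketch of the idea: each file's contents go to the tool's stdin and the rewritten contents come back on stdout, so no working copy has to be materialized. The helper names (`run_file_filter`, `filter_files`) and the stand-in `UPPERCASE` tool are assumptions for illustration only and are not jj APIs.

```python
import subprocess
import sys


def run_file_filter(argv, content):
    """Pass one file's bytes through a tool's stdin and return its stdout."""
    result = subprocess.run(argv, input=content, capture_output=True, check=True)
    return result.stdout


def filter_files(argv, files):
    """Apply the filter to every file; return only the entries that changed."""
    changed = {}
    for path, content in files.items():
        new_content = run_file_filter(argv, content)
        if new_content != content:
            changed[path] = new_content
    return changed


# Stand-in "formatter" that upper-cases its input; a real tool would go here.
UPPERCASE = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())",
]
```

Because the tool only ever sees bytes on stdin, the files can come straight from the commit store, which is what makes this mode cheaper than the working-copy modes.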
My first impression of `jj fix` is that it seems weird and the only thing to fix is some sort of metadata integration with a bug ticket system, marking a commit as a bug fix.
I don't expect that `jj run` would make any modifications to the working copy since it would be "running" on a revset, not the working copy. I suppose the actual command that is run could make modifications though. Perhaps a flag should be used to indicate what to snapshot. I assume the working copy is untouched because I expect that I could run simultaneous `jj run` commands in different windows.
I don't think `jj fix` makes any sense, but I would expect that it would take parameters for what to work on. (It seems that running formatting tools or whatever you are fixing is the responsibility of other tools, not version control.)
To be clear, the main reason `jj fix` exists right now is because it is immediately valuable to Google, even in its nascent form. That `jj run` isn't fleshed out yet is partly because we're discussing it, rather than letting Google (me) throw more specialized code over the fence, which I think is a good thing. We should not consider any of the work done to date on `jj fix` to be a constraint; Google will be fine as long as our users can run stdin/stdout formatters on changed files (and, preferably, changed line/byte ranges) "quickly" by typing `jj fix`, regardless of the implementation. I think we can reach a similar outcome for Google's use cases for its internal `hg run` (which is interesting in its own right, but mostly for the proprietary integrations, and not so much for its tremendous insight into the problems/opportunities we're facing with `jj run`).
Fortunately, I think we have a lot of convergent thinking on many aspects. I keep seeing hypothetical statements that sound like what I've been thinking.
I've attempted to arrange the many desired behaviors into a small-ish set of orthogonal-ish dimensions, in the spirit of making `jj run` the "Swiss army knife" of this feature cluster. Hopefully each dimension serves to map directly between a UI widget (like a single flag or GUI element) and a corresponding aspect of the implementation. This should allow for Google's `jj fix` to be expressed as an alias for a customized `jj run` invocation, making the layers of abstraction clearer, and keeping the unfortunate string `fix` as far away from out-of-the-box JJ as possible.
I also hope that this provides a framework for phased implementation, where we implement the default behavior of each flag first, and fill in combinations over time.
Dimensions with some hypothetical enumerations:
- `--parallelism=1`).
- `clang-format` during `clang-format -i $path`; this might be redundant with the snapshotted working copy modes)
- `jj fix`)
- `jj run bash`; weird/messy for parallel execution)
- `jj`'s stdout. Google uses something like this with `hg` to good effect.
- `--all-files` or `--changed-files`:
  - `jj fix`-like behavior). This can be based on either a diff against a set of common ancestors, as in `hg fix`, or on incremental diffs as in `jj fix`. Those are not semantically equivalent.
  - `jj fix`)
- `jj run <options> [--] <argv>` or `jj run -r foo --exec '<argv>'`. We need to decide if positional arguments are used for argv, file patterns, or otherwise.
- `jj` builds to send RPCs. The builtin could be identified by a string that gets passed to a flag like `--exec-internal`.
- `jj run -r foo --tool tool1 --tool tool2`. Maybe this is the same flag as the builtin. The configuration would define which tools/functions/RPCs are applicable to which execution scopes/modes.

I'm not sure what the flags for each dimension should look like exactly. We can either make exclusivity obvious with `--thing=foo|bar`, or make it sugary with `--foo-thing|--bar-thing`. I'm not sure if we have a strong precedent for this in the CLI. In fact, that may even be a reason to not design it this way.
I have some thoughts on how this would map to implementation, but I wanted to post this before I spend too much time on that. I think it could make good use of a planning phase that considers the flags and generically feeds instructions into a queue that is consumed by a worker pool that doesn't concern itself with flags. Instructions would include some arrangement of things like materializing working copies, executing subprocesses in working copies, passing file contents through subprocesses, rewriting commits with a `FileId` -> `FileId` map, etc. with some mechanism for data dependencies between them.
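As a rough illustration of that planner/queue/worker split, here is a small Python sketch. Everything in it is an assumption made for the example: `file_id` is a content hash standing in for jj's `FileId`, commits are plain dicts of path to file id, the "tool" is an in-process callable instead of a subprocess, and only one instruction kind (`"filter"`) exists. Data dependencies between instructions are not modeled.

```python
import hashlib
import queue
import threading


def file_id(content):
    """Content-addressed id, standing in for jj's FileId."""
    return hashlib.sha256(content).hexdigest()


def plan(commits):
    """Planning phase: emit one instruction per distinct file id."""
    seen = set()
    for tree in commits.values():
        for fid in tree.values():
            if fid not in seen:
                seen.add(fid)
                yield ("filter", fid)


def worker(instructions, store, tool, id_map, lock):
    """Consume instructions until a None sentinel; knows nothing about flags."""
    while True:
        item = instructions.get()
        if item is None:
            break
        _kind, old_id = item
        with lock:
            content = store[old_id]
        new_content = tool(content)
        new_id = file_id(new_content)
        with lock:
            store[new_id] = new_content
            id_map[old_id] = new_id


def run(commits, store, tool, workers=2):
    """Plan, execute on a worker pool, then rewrite trees via the id map."""
    instructions = queue.Queue()
    id_map, lock = {}, threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(instructions, store, tool, id_map, lock))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for instruction in plan(commits):
        instructions.put(instruction)
    for _ in threads:
        instructions.put(None)  # one stop sentinel per worker
    for t in threads:
        t.join()
    # Rewrite every commit's tree using the old-id -> new-id map.
    return {
        cid: {path: id_map[fid] for path, fid in tree.items()}
        for cid, tree in commits.items()
    }
```

Note that identical file contents shared by several commits are processed once, since the plan is keyed on file ids instead of on (commit, path) pairs.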
As an aside, this is a complicated feature. I think it is worthwhile in its entirety, but we should honestly consider if it's not. If we scope it down, we cede more territory to third party extensions, and allow the functionality to develop in a more fragmented way (for better or worse). Maybe the incoming governance structure can weigh in on that.
If existing tools have `run` or `fix`, are their designs robust enough to use for `jj`, or do you need to slog through an entire design process for this? Can you use the best of what already works?
@joyously posted responses to the questions here.
@emilazy posted responses to the questions in the Discord, reproduced below:
@arxanas: When you consider a command called jj fix, what is your first impression in terms of what the command would be able to do/what the scope would be?
@arxanas: Do you expect that modifications dumped into the working copy with jj run will be automatically reflected in the caller (like it'll do the same rewrites as jj fix), or would you have to do something else to indicate that you want the changes to be reflected outside the jj run working copy? (Or, opposite, do you expect that there are cases that jj run changes to the working copy should be discarded/ignored?) Like do you imagine yourself using jj run with commands that don't modify the working copy?
That last point about "seeing changes made in previous commits" is important. It could either be considered as another dimension, or part of the "scope" dimension from my last comment. `jj fix` is basically "not seeing previous changes" under the assumption that it doesn't matter for a category of tools including formatters. You could also consider `jj fix` a hybrid because of the diffing used to target changed files and avoid rebasing. I think we would want to be careful to avoid n^2 auto-rebasing behavior in the "see previous changes" mode (this is one way where `jj run` can be more useful than the equivalent shell loop, and could be useful to document in `jj help run`).
A couple of things seem clearer now:
- `fix` is a bad name. We kind of knew that already, but it seems confusing enough that any historical reasons for using it should be abandoned. We could consider something like `jj format` in the future if it's possible to agree on some sensible defaults. Shipping it as a no-op is a bit weird.
- `jj fix` are a little too monorepo focused for many users' expectations. Tucking it away as a variation of `jj run` probably makes sense in the long term, from the user perspective and technically.

I also generally agree with the idea that `--edit` by default is too risky to be net useful, even assuming it plays well with `jj undo`.
I don't think anyone contradicted this, but I want to point out that `--edit` only implies serializing parents and children; we could still parallelize branches, even if they merge.
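The point about serializing only parents and children can be sketched with Python's standard `graphlib` module (Python 3.9+). The function name `execution_waves` and the revision names are hypothetical; the sketch only groups revisions into "waves" where every revision in a wave has all of its parents finished, so revisions within a wave could run in parallel.

```python
from graphlib import TopologicalSorter


def execution_waves(parents):
    """Group revisions into waves; each wave can run in parallel.

    `parents` maps each revision to the set of its parents that are
    also being processed.
    """
    sorter = TopologicalSorter(parents)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # Everything returned here has had all of its parents marked done.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For a diamond where `B` and `C` both descend from `A` and merge into `D`, this yields `[["A"], ["B", "C"], ["D"]]`. Waves are a simplification: a real scheduler could start each revision as soon as its own parents finish instead of waiting for the whole wave.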
Another rabbit hole is that `--edit` implies "topological execution order", but later we may also want to consider some notion of predicting the runtime for each revision, so we can optimize that order for a shorter critical path. That's complicated, because it interacts with external things like build system caches, but it will come up eventually. I wonder if we can get ahead of that kind of thing by implementing a more general bin packing for resources like RAM, cores, and wall time?
Is your feature request related to a problem? Please describe.
`jj fix` (/`hg fix`) are currently specialized towards formatters. However, these formatters have to operate in-memory on single files, which limits the utility somewhat.

For context, `git-branchless-test fix` is implemented to operate on working copies, which I find quite useful in practice.

There are several cases where you would prefer the full flexibility of running in the working copy:
- `jj fix --exec 'cargo fmt'` without any complicated configuration.
- `pre-commit` checks.
- `jj fix --exec 'cargo clippy --fix'`.
- `git-branchless-submit` works with Phabricator: it runs `arc diff` and actually amends the commit message with the effective change ID (the Phabricator URL).

Describe the solution you'd like
`jj fix` can operate in the working copy, similar to existing designs for `jj run`. It would ideally be able to run in the current workspace, or in parallel across multiple workspaces.

Describe alternatives you've considered

- `--fix` flag to `jj run` instead. That's kind of weird because "fixes" are more or less the same thing to the user, so there's no reason to arbitrarily split them up into two different subcommands.