Open chriskrycho opened 1 year ago
What's the timing with `--ignore-working-copy` on `jj status`? If it's anything like nixpkgs, the watchman support that landed in #1731 the other day is maybe worth attempting to use since it will (in theory) keep the snapshots proportional to working copy changes.
When you run a command in a colocated repo (created via `jj init --git-repo=.`), every command will import git refs from the git repo at the start of the command, and will also export git refs at the end of the command. Since you said it varies significantly between repos, I suspect that's an important reason for the slowness in your case. How does it perform if you give the jj repo its own working copy (so it's more like a git worktree)? You can do that in git repo `foo` with something like `cd ..; jj init --git-repo=foo foo-jj`.
import git refs from the git repo at the start of the command
Given that there's "hundreds of thousands of tags" (dont ask, dont ask, dont ask), that seems like something that absolutely affects the status :)
So yeah, can't wait to know what it is with `jj init --git-repo=foo`
Given that there's "hundreds of thousands of tags"
`git pack-refs` can also help if you're using an unreleased jj version.
Okay, so this is interesting:
How does it perform if you give the jj repo its own working copy (so it's more like a git worktree)?
The `init` operation took only ~4m instead of 5m, and doing `jj status` took about 9s instead of 25s, so that seems like a big win, though still a long way from where I'd want to be. (Aside: doing this before running `yarn` in the repo took about 2s, doing it after took 9s; possibly related to #1785?) Doing `jj branch list` took 33s in the colocated repo and 9s in the non-colocated repo. Relevant: there are 🙈 4540 active branches. (Again: don't ask; it's right up there with the 112,625 tags. I have complained, to no avail. 😂)
The underlying performance seems to be pretty consistent across these: `jj log` also takes 33s in the colocated repo and 9s in the non-colocated repo.
What's the timing with `--ignore-working-copy` on `jj status`?
SUUUUPER fast. Hyperfine says 263.8±4.1ms. Also, zero difference on that between colocated and non-colocated.
…the watchman support that landed in https://github.com/martinvonz/jj/pull/1731 the other day is maybe worth attempting to use since it will (in theory) keep the snapshots proportional to working copy changes.
👍🏼
`git pack-refs` can also help if you're using an unreleased jj version.
Weirdly, `jj log` drops to 17s after doing that in the repo… the first time, then returns to taking 33s after that. 🤔
Happy to keep providing details/etc.!
How does it perform if you give the jj repo its own working copy (so it's more like a git worktree)?
The `init` operation took only ~4m instead of 5m, and doing `jj status` took about 9s instead of 25s,
So, a fair amount of time would be spent importing refs: 25s - 9s = ~16s. Watchman will help to reduce the 9s part to a few hundred ms, I suppose.
`git pack-refs` can also help if you're using an unreleased jj version.
Weirdly, `jj log` drops to 17s after doing that in the repo… the first time, then returns to taking 33s after that.
No idea what happened for the first run. The current (unreleased) version of `jj` will filter out known tags without loading them if `.git/refs/tags` has migrated to `.git/packed-refs`. This wouldn't help if tags point to non-commit objects, though.
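To check whether a repo's tags are still loose files or have already moved into `packed-refs`, something like the following could be used. It is only a sketch that reads the on-disk layout directly; `count_refs` is a made-up helper name, not part of jj or git, and it assumes the standard `packed-refs` text format (`<oid> <refname>` lines, `#` header, `^<oid>` peeled lines).

```python
import os


def count_refs(git_dir: str, prefix: str = "refs/tags") -> dict:
    """Count loose ref files and packed-refs entries below `prefix`."""
    # Loose refs: one small file per ref under <git_dir>/refs/...
    loose = 0
    loose_root = os.path.join(git_dir, *prefix.split("/"))
    for _dirpath, _dirnames, filenames in os.walk(loose_root):
        loose += len(filenames)

    # Packed refs: one line per ref in a single file.
    packed = 0
    packed_path = os.path.join(git_dir, "packed-refs")
    if os.path.exists(packed_path):
        with open(packed_path, encoding="utf-8") as f:
            for line in f:
                line = line.rstrip("\n")
                # Skip the header comment and "^<oid>" peeled-tag lines.
                if not line or line[0] in "#^":
                    continue
                _oid, _, name = line.partition(" ")
                if name.startswith(prefix + "/"):
                    packed += 1
    return {"loose": loose, "packed": packed}
```

Calling `count_refs(".git")` before and after `git pack-refs --all` should show the loose count dropping and the packed count rising.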
This wouldn't help if tags point to non-commit objects, though.
In this case, every tag points expressly and specifically to a commit, because every commit on the trunk branch is tagged.[^why] So it should help, in theory… but in practice it does not seem to have done so. 🤔
[^why]: Somebody long ago decided that not only was trunk-based development, where every version of a thing was deployable, good (sure, yep, I agree), but that the only good way to do that in Subversion was with tags (hmm, okay, maybe?), and then this behavior was preserved exactly when migrating to Git (Wait, you say, that's redundant with tags! Yep. And that is the sound of me sighing heavily every time I think about it), and also this behavior was applied to applications, not just libraries (okay, fine, tools only need to understand one thing, fine) and also there is no pruning of versions even when they are most of a decade old and literally could not be deployed. So here we are. this-is-fine.gif etc.
So, a fair amount of time would be spent importing refs: 25s - 9s = ~16s. Watchman will help to reduce the 9s part to a few hundred ms, I suppose.
Perhaps we could also point Watchman at the Git ref files/directories, so that we could at least skip importing refs when none of them have changed (or something more ambitious where we import refs selectively based on which files have changed).
Perhaps we could also point Watchman at the Git ref files/directories,
Maybe we can also save & compare `.git/refs/{heads,remotes,tags}/**` directory stats if watchman isn't enabled.
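A rough sketch of that "save & compare stats" idea, to make it concrete. This is not how jj does it; `refs_fingerprint` and `refs_unchanged` are hypothetical names, and the sketch stats every loose ref file plus `packed-refs` rather than only the directories, which is simpler to reason about but still costs one `stat` per loose ref.

```python
import os


def refs_fingerprint(git_dir: str) -> frozenset:
    """Collect (relative path, mtime_ns, size) for packed-refs and every loose ref."""
    candidates = [os.path.join(git_dir, "packed-refs")]
    for sub in ("heads", "remotes", "tags"):
        root = os.path.join(git_dir, "refs", sub)
        for dirpath, _dirnames, filenames in os.walk(root):
            candidates.extend(os.path.join(dirpath, n) for n in filenames)

    entries = set()
    for path in candidates:
        try:
            st = os.stat(path)
        except FileNotFoundError:
            continue  # e.g. no packed-refs yet, or a ref deleted mid-scan
        entries.add((os.path.relpath(path, git_dir), st.st_mtime_ns, st.st_size))
    return frozenset(entries)


def refs_unchanged(git_dir: str, saved: frozenset) -> bool:
    """True if the refs look the same as when `saved` was taken (import could be skipped)."""
    return refs_fingerprint(git_dir) == saved
```

A real implementation would also have to handle timestamp granularity (a ref rewritten within the same clock tick with the same size would be missed), so treat this as an illustration of the shape of the check only.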
I have a similar issue with the Fuchsia repository.
$ time jj status
Parent commit: 59a346bf5dfb [docs] Add missing docs for log-command
Working copy : 0c96d3dc092e (no description set)
The working copy is clean
________________________________________________________
Executed in 5.11 secs fish external
usr time 3.63 secs 0.00 micros 3.63 secs
sys time 1.48 secs 801.00 micros 1.48 secs
During this command, the snapshotting output shows it's spending a lot of time in `out/` and `prebuilt`, which are in the `.gitignore` file. Shouldn't `jj` be ignoring these paths during the snapshotting process? `jj untrack` doesn't seem to cause `jj` to exclude these directories when running `status`.
If I delete `out/`, we get some help:
$ rm -rf out/
$ time jj status
Parent commit: 59a346bf5dfb [docs] Add missing docs for log-command
Working copy : 0c96d3dc092e (no description set)
The working copy is clean
________________________________________________________
Executed in 2.66 secs fish external
usr time 1.94 secs 15.99 millis 1.92 secs
sys time 0.75 secs 8.04 millis 0.74 secs
Ignoring the working copy helps a lot:
$ time jj status --ignore-working-copy
Parent commit: 59a346bf5dfb [docs] Add missing docs for log-command
Working copy : 0c96d3dc092e (no description set)
The working copy is clean
________________________________________________________
Executed in 361.97 millis fish external
usr time 292.45 millis 471.00 micros 291.98 millis
sys time 71.24 millis 189.00 micros 71.06 millis
Using the working tree method:
$ time jj status
Parent commit: 59a346bf5dfb [docs] Add missing docs for log-command
Working copy : 532445ea411d (no description set)
The working copy is clean
________________________________________________________
Executed in 1.67 secs fish external
usr time 1.13 secs 0.00 micros 1.13 secs
sys time 0.54 secs 805.00 micros 0.54 secs
That's not horrible, but you're left without a `.git` directory, which may be needed if you have Git helpers (e.g. for interacting with Gerrit).
I don't fully grok the help text for `--ignore-working-copy`:
If you want to avoid snapshotting the working copy and instead see a possibly stale working copy commit, you can use `--ignore-working-copy`. This may be useful e.g. in a command prompt, especially if you have another process that commits the working copy.
I'm unclear how the snapshot can go stale if `jj` is the one updating it each time, even if `jj` is in a different command prompt. I am guessing this might be covering the case where Git modifies the working copy somehow?
I don't mind aliasing `jj` to have `--ignore-working-copy` all the time, but the downsides of this aren't very clear to me.
NOTE: I am using a GCP Virtual Desktop and so the backing networked SSD isn't all that fast. The issue may be lessened if working with local nvme.
What to do? :)
PS I installed `watchman`, built `jj` with the feature flag and enabled it in my config. I can see that the daemon is running. Does this mean I can now use `--ignore-working-copy` all the time safely?
I'm unclear how the snapshot can go stale if jj is the one updating it each time
Well, `--ignore-working-copy` is exactly the flag to make jj not do that 🤷
The workspace is technically stale any time you make changes and don't run any jj commands (without that flag).
Since any command snapshots the WC beforehand, it never sees it being stale. But something like `jj diff --ignore-working-copy` will not snapshot (aka amend to the WC) the changes you made since the last jj command, and the diff (in that example) will not include those changes. Hence we say the state it sees is stale.
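The difference can be shown with a toy model. Nothing here is jj code: `ToyRepo`, `disk`, and `wc_commit` are invented names, and files are just a dict of path to contents.

```python
class ToyRepo:
    """Toy model: `disk` is what is on disk, `wc_commit` is the last recorded snapshot."""

    def __init__(self, files: dict):
        self.disk = dict(files)
        self.wc_commit = dict(files)

    def snapshot(self) -> None:
        # What a command does first unless --ignore-working-copy is given.
        self.wc_commit = dict(self.disk)

    def changed_paths(self, parent: dict, ignore_working_copy: bool = False) -> list:
        """Paths that differ between `parent` and the working-copy commit."""
        if not ignore_working_copy:
            self.snapshot()
        paths = set(parent) | set(self.wc_commit)
        return sorted(p for p in paths if parent.get(p) != self.wc_commit.get(p))


parent = {"a.txt": "1"}
repo = ToyRepo(parent)
repo.disk["a.txt"] = "2"  # an edit made without running any jj command

stale_view = repo.changed_paths(parent, ignore_working_copy=True)  # edit not seen
fresh_view = repo.changed_paths(parent)  # snapshot taken first, edit seen
```

In the stale case the diff is computed against the old working-copy commit, so the edit is invisible until some later command takes a snapshot.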
If you understood my comment better than the help text feel free to suggest better wording :)
edit: ugh, I see now that your question was related to watchman and you probably already knew everything I re-explained here :|
PS I installed watchman, built jj with the feature flag and enabled in my config. I can see that the daemon is running. Does this mean I can now use --ignore-working-copy all the time safely?
Whether you use `--ignore-working-copy` is orthogonal to the availability of Watchman. It only means that working copy snapshots won't be taken. If a snapshot is taken and Watchman is available, then jj will use Watchman as a faster path instead of scanning the filesystem.
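As a toy illustration of that division of labor: the watcher only remembers what changed, and the snapshotting side decides when to ask. `ToyWatchman` and `paths_to_rescan` are invented for this sketch and do not reflect Watchman's real protocol or jj's code; the integer clock stands in for Watchman's clock values.

```python
class ToyWatchman:
    """Toy file watcher: it records changes but never takes snapshots itself."""

    def __init__(self):
        self.clock = 0
        self.log = []  # (clock value, path) for every observed change

    def record_change(self, path: str) -> None:
        self.clock += 1
        self.log.append((self.clock, path))

    def changed_since(self, previous_clock: int):
        """Return the paths changed after `previous_clock` and the current clock."""
        paths = sorted({p for c, p in self.log if c > previous_clock})
        return paths, self.clock


def paths_to_rescan(all_paths, watcher, previous_clock):
    """Full scan without a watcher or saved clock; otherwise only the reported paths."""
    if watcher is None or previous_clock is None:
        return sorted(all_paths), None
    return watcher.changed_since(previous_clock)
```

With `--ignore-working-copy`, `paths_to_rescan` would simply never be called, whether or not a watcher exists.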
Make sure that you set `core.fsmonitor` to `watchman` in your repo as well (`jj config set`). You should be able to confirm that Watchman is being used for snapshotting by invoking `jj` with the environment variable `RUST_LOG=info`. It should print a message saying that it is querying Watchman.
So, to be clear, Watchman doesn't cause snapshots to be taken when it notices that something changed, correct? It only keeps track of the changed files to tell `jj` when it asks.
You should be able to confirm that Watchman is being used for snapshotting by invoking jj with the environment variable RUST_LOG=info. It should print a message saying that it is querying Watchman.
Good to know! I wish this was one of the `jj debug watchman` commands. I can't tell whether `jj debug watchman query-changed-files` working implies that jj uses watchman or not.
Excellent, I think I needed `core.fsmonitor` in the repo too. I had it in my user config but I don't think that helped. Now it's working!
$ time jj status
Parent commit: 59a346bf5dfb [docs] Add missing docs for log-command
Working copy : 0c96d3dc092e (no description set)
The working copy is clean
________________________________________________________
Executed in 807.39 millis fish external
usr time 646.04 millis 621.00 micros 645.42 millis
sys time 175.54 millis 278.00 micros 175.27 millis
I will give that a try on the work machine tomorrow—I'd love to be able to start using it there, because after the last month Git feels janky as heck every time I use it. 😂
I posted in Discord https://discord.com/channels/968932220549103686/969291218347524238/1129516951706816532 but should post here as well:
Here's a `tracing` profile of `jj status` in `nixpkgs` with Watchman.
Interesting segments:
- `snapshot`: 257ms
- `import_git_refs`: 53ms
- `tree_state` (reading it): 51ms
- deleting file states (filtering out Git submodules from a list of files): 14ms
- `make_fsmonitor_matcher`: 99ms
- `query_watchman`: 84ms (still a bit much in my opinion...)
- `finish`: 25ms
- `write_commit_summary`: 10ms
- `conflicts`: 424ms 😱
- `cmd_status` is `conflicts`, so I'm guessing the remainder is some `Drop` implementation

@martinvonz is working on tree-level conflicts, which should take care of the biggest bottleneck. I think we can cut ~90ms if we stop storing file states in the tree-state proto for the Watchman case.
With some additional feature work, we could possibly reduce `import_git_refs` somewhat by querying Watchman (might have to do it in parallel with snapshotting the working copy... actually, it would probably help to do them in parallel right now). The last 50ms of remaining time need more investigation. But then I think we could get `status` down to an acceptable ~100ms.
Related: snapshotting adds significant overhead for `jj status` compared to `git status`. That's not unexpected, since `jj status` does massive amounts more work, but it is noticeable, as it's a full order of magnitude:
$ hyperfine "git st" "jj st"
Benchmark 1: git st
Time (mean ± σ): 10.1 ms ± 0.4 ms [User: 3.7 ms, System: 5.7 ms]
Range (min … max): 9.5 ms … 12.9 ms 192 runs
Benchmark 2: jj st
Time (mean ± σ): 110.2 ms ± 1.9 ms [User: 52.1 ms, System: 56.2 ms]
Range (min … max): 108.1 ms … 116.1 ms 25 runs
Summary
git st ran
10.95 ± 0.44 times faster than jj st
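For anyone wanting to sanity-check the summary line, the ratio and its uncertainty can be roughly reproduced from the rounded means and standard deviations above using ordinary error propagation for a quotient. This is a sketch of the arithmetic, not hyperfine's code; `speed_ratio` is a made-up helper.

```python
import math


def speed_ratio(fast_mean, fast_sd, slow_mean, slow_sd):
    """Ratio slow/fast with a first-order propagated standard deviation."""
    ratio = slow_mean / fast_mean
    # Relative errors of a quotient add in quadrature.
    rel = math.sqrt((fast_sd / fast_mean) ** 2 + (slow_sd / slow_mean) ** 2)
    return ratio, ratio * rel


# Rounded figures from the benchmark output: git st 10.1 ± 0.4 ms, jj st 110.2 ± 1.9 ms.
ratio, err = speed_ratio(10.1, 0.4, 110.2, 1.9)
```

With the rounded inputs this lands near 10.9 ± 0.5, in line with the reported 10.95 ± 0.44; the small gap is presumably because hyperfine works from unrounded measurements.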
This is just using the v0.8.0 mainline, no `watchman` etc., and I have yet to run it instrumented via #1870. For context, the repo has ~3,000 commits and the (never packed AFAIK) `.git` directory is ~73MB.
So, to be clear, Watchman doesn't cause snapshots to be taken when it notices that something changed, correct? It only keeps track of the changed files to tell jj when it asks.
@ilyagr That's correct. One way is to launch a daemon and use a Watchman subscription: https://facebook.github.io/watchman/docs/cmd/subscribe. Actually, it seems that Watchman has a `trigger` system to do something like what you describe, which I didn't know about until now: https://facebook.github.io/watchman/docs/cmd/trigger
Related: snapshotting adds significant overhead for jj status compared to git status—not unexpected, since jj status does massive amounts more work, but noticeable, as it's a full order of magnitude:
TIL hyperfine accepts multiple commands to benchmark 🤣. It's worth noting that `jj status` is entirely single-threaded (for now) while `git status` is multithreaded. Ideally, raw `jj status` should perform approximately as well as `git status`.
Encountered this and can confirm that with:
jj config set --user core.fsmonitor watchman
Both `jj log` and `jj status` become much faster in a large repo.
The work on changing how conflicts are stored is now pretty much done. You can set `format.tree-level-conflicts = true` to use the new format. That should remove almost all of the time spent on `conflicts` in @arxanas's profile above, for example. However, note that the feature is still very new and not tested much (all automated tests pass, though), and that it won't speed up access to existing commits nor commits imported from Git (I think @yuja is thinking of fixing that).
With #2232 merged, you should see significantly better performance in fresh clones of large repos. For example, I timed `jj log | head -1000` in the Linux repo. That took ~13 s before and ~2.3 s after.
@martinvonz is it recommended to fresh re-clone a large repo?
I think that depends on how often you want to look at old commits. New commits will use the new format once you've set `format.tree-level-conflicts = true`, but you'll need to re-clone (with a version built after #2232) to get the speedup on commits that are already in the git repo.
Does this mean watchman is no longer required for meaningful work on large repos?
No, it doesn't mean that. Watchman helps with snapshotting the working copy by keeping track of which files have changed between two consecutive snapshots. The tree-level conflicts makes it faster to determine which paths have conflicts (and, importantly, it makes it faster to determine when there are no conflicts).
I ran into a bug yesterday that's most likely caused by tree-level conflicts. I resolved a conflict in one commit and squashed the resolution into it. There were still some descendant commits that were shown as conflicted in `jj log`, but when I inspected one with `jj diff`, it said "Created conflict in CHANGELOG.md:" but the diff didn't contain any conflict markers. I'll try to find time to look into that soon (hopefully today).
Is it normal to see `jj` initialize the watchman monitor every time it's invoked? It seems two consecutive `jj log` calls are both initializing the monitor first:
foo/bar (8e1c5d4) ❯ RUST_LOG=info jj
2023-09-13T17:24:19.622654Z INFO run:run_internal:run_command:cmd_log:workspace_helper_internal{snapshot=true}:snapshot:snapshot_working_copy:snapshot:make_fsmonitor_matcher:query_watchman:init{working_copy_path="/Users/foo/Projects/bar"}: jj_lib::fsmonitor::watchman: Initializing Watchman filesystem monitor...
2023-09-13T17:24:19.693706Z INFO run:run_internal:run_command:cmd_log:workspace_helper_internal{snapshot=true}:snapshot:snapshot_working_copy:snapshot:make_fsmonitor_matcher:query_watchman:query_changed_files{previous_clock=Some(Clock(Spec(StringClock("c:1694462839:76320:8:4422"))))}: jj_lib::fsmonitor::watchman: Querying Watchman for changed files...
@ sxxkulzr Chih-Wei Chang <2840571+lazywei@users.noreply.github.com> 1 minute ago 182b55a9
│ (empty) (no description set)
◉ kyvzrspo Chih-Wei Chang <2840571+lazywei@users.noreply.github.com> 1 minute ago cwc/pr-10833* HEAD@git 8e1c5d4a
│ ...
foo/bar (8e1c5d4) ❯ RUST_LOG=info jj
2023-09-13T17:24:21.341554Z INFO run:run_internal:run_command:cmd_log:workspace_helper_internal{snapshot=true}:snapshot:snapshot_working_copy:snapshot:make_fsmonitor_matcher:query_watchman:init{working_copy_path="/Users/foo/Projects/bar"}: jj_lib::fsmonitor::watchman: Initializing Watchman filesystem monitor...
2023-09-13T17:24:21.410058Z INFO run:run_internal:run_command:cmd_log:workspace_helper_internal{snapshot=true}:snapshot:snapshot_working_copy:snapshot:make_fsmonitor_matcher:query_watchman:query_changed_files{previous_clock=Some(Clock(Spec(StringClock("c:1694462839:76320:8:4428"))))}: jj_lib::fsmonitor::watchman: Querying Watchman for changed files...
@ sxxkulzr Chih-Wei Chang <2840571+lazywei@users.noreply.github.com> 1 minute ago 182b55a9
│ (empty) (no description set)
◉ kyvzrspo Chih-Wei Chang <2840571+lazywei@users.noreply.github.com> 1 minute ago cwc/pr-10833* HEAD@git 8e1c5d4a
│ ...
~
And it almost feels the fsmonitor doesn't help much in either call, it's not very slow but there is a noticeable delay.
I think that's just saying that we're initializing the connection to watchman. The process is still running between calls, right?
FWIW I only see one watchman process, and it persists across invocations.
I tried uninstalling watchman and timing `jj log`:
// First make sure no watchman is available
2023-09-13T18:00:58.277462Z WARN run:run_internal:run_command:cmd_log:workspace_helper_internal{snapshot=true}:snapshot:snapshot_working_copy:snapshot:make_fsmonitor_matcher: jj_lib::working_copy: Failed to query filesystem monitor err=Fsmonitor(WatchmanConnectError(ConnectionDiscovery { watchman_path: "watchman", reason: "No such file or directory (os error 2)", stderr: "" }))
// time `jj log`
hyperfine --warmup 5 "jj log"
Benchmark 1: jj log
Time (mean ± σ): 468.3 ms ± 8.4 ms [User: 426.7 ms, System: 531.7 ms]
Range (min … max): 458.6 ms … 487.9 ms 10 runs
And then I installed watchman and made sure it's queried:
2023-09-13T18:03:33.691071Z INFO run:run_internal:run_command:cmd_log:workspace_helper_internal{snapshot=true}:snapshot:snapshot_working_copy:snapshot:make_fsmonitor_matcher:query_watchman:query_changed_files{previous_clock=Some(Clock(Spec(StringClock("c:1694628176:36088:1:100"))))}: jj_lib::fsmonitor::watchman: Querying Watchman for changed files...
hyperfine --warmup 5 "jj log"
Benchmark 1: jj log
Time (mean ± σ): 462.2 ms ± 7.1 ms [User: 215.8 ms, System: 236.8 ms]
Range (min … max): 456.3 ms … 481.3 ms 10 runs
The latency of `jj log` doesn't seem to change a lot? On the bright side, it means I don't need watchman and it's fast, but on the other hand `git status` or `git log` is around 100ms when measured with hyperfine.
If you're curious what's taking time, you can try profiling using e.g. samply. Just install with `cargo install samply`, then run e.g. `samply record jj log` and open the link it prints. Feel free to share a screenshot.
Wow, this tool is pretty cool. I was trying `cargo flamegraph` yesterday and got a similar result.
The trace I got from `jj log`: https://gist.github.com/lazywei/04f7fece398d01917e4d4b9209e2e6e5
And here is the samply profile: https://share.firefox.dev/3LpIbYO
Both of them seem to show that `git_futils_readbuffer_updated` takes a lot of time. Is this due to the nature of a large git repo?
Ah, that confirms one of my suspicions: that importing refs from git takes a lot of time. When you're in a colocated repo, every `jj` command will start by importing refs from git and end by exporting refs to git. If we didn't do that, HEAD and branches could point to different places according to `git` and `jj`, which would be very confusing. The only solution we have is to simply not colocate your repos. That would mean having your git working copy and your jj working copy in separate directories and manually running `jj git import` and `jj git export` when you switch between working in the git working copy and the jj working copy. If you rarely need to use `git` commands, then that's not much of a problem.
I see. That makes sense. The reason I need a colocated repo is that some of our team's scripts make assumptions about git, like `git rev-parse --show-toplevel` and tagging for release etc. But that's a different problem, so I'll avoid derailing this issue. Thank you!
Both of them seem to show that `git_futils_readbuffer_updated` takes a lot of time. Is this due to the nature of a large git repo?
If you have tons of refs under the `.git/refs` directory, try `git pack-refs`. It will reduce the overhead of automated git imports.
Another regularly scheduled performance update: I am testing an array of memory-related optimizations and changes in https://github.com/martinvonz/jj/pull/2503. In some cases, for large repositories, these changes will improve performance by up to 2x, i.e. the same operation will complete with the same output while using 50% of the original wall clock time. This includes operations like `jj files` and `jj st`. The exact speedups will depend on whether or not your repos are colocated, whether or not you are using watchman, and whether you are using official binary releases or compiling from source. Some cases may "only" give you a 10-30% speedup.
I am going to keep iterating on this branch as I don't expect it to go in immediately and I will keep testing new changes. The goal is for every change to go upstream, and to do it piecewise. So you should consider this a publicly available testing branch, not a traditional PR. Some changes may or may not improve performance (i.e. they may only improve observability), but the goal is for every change to result in a net ~0% runtime increase, at the minimum.
If you aren't afraid to compile from source code, please give it a try and report back with the Commit ID of the `jj` build you are using (as I will be rebasing the branch), as well as some basic details: repository size, whether it is colocated, whether you have watchman, and what operating system you're using. I have only tested these numbers on Linux so far; macOS and Windows users are welcome to try it as well. I'll eventually get around to benchmarking those either way, as I have access to all 3 systems.
EDIT: Something like this should get you going:
cargo install \
--locked --git https://github.com/martinvonz/jj.git \
--branch aseipp/push-mwwotvxyruwp \
--bin jj jj-cli
is it reasonable to expect that operations like `jj git push` would be extremely slow for co-located, large repositories even if all other operations are relatively snappy?
i'm using `watchman` & i tend to only `jj git fetch` the branches i need or am working on, but on average `jj git push -b my-branch` takes about a minute:
jj git push -b jkachmar/some-example-branch 55.88s user 0.58s system 92% cpu 1:00.84 total
the repo i'm working with is pretty large, but i looked through the issues & haven't seen anyone specifically calling out `jj git push` performance as particularly slow when everything else seemed to be pretty tolerable.
for reference, after running `jj util gc`:
$ git count-objects -vH
count: 0
size: 0 bytes
in-pack: 1444949
packs: 2
size-pack: 6.79 GiB
prune-packable: 0
garbage: 1
size-garbage: 355.00 MiB
I also experience slow pushes on large repos (Nixpkgs).
Perhaps profiling the push using the suggestions from https://github.com/martinvonz/jj/blob/main/docs/contributing.md#profiling might indicate something?
I wonder if the safety checks from https://github.com/martinvonz/jj/pull/3522 might need optimizing of some sort, but that's just because it's the last thing I know changed with `jj git push` recently.
finally got around to profiling this (is there a better way to export symbolized `samply` profiles than just screenshots?).
so it looks like most of the time is spent in `git_revwalk_next`..?
Does `jj util gc` make it any faster?
a little: `git_revwalk_next` takes 72 seconds before `jj util gc` & 64 seconds afterwards
I wonder if it's the number of refs that's the problem. What does `git for-each-ref | wc -l` say (add `--git-dir .jj/repo/store/git` if your repo is not colocated)?
❯ git for-each-ref | wc -l
13058
That's not very much so it's probably not the problem. Is regular `git push` (of a similar set of branches) fast? Assuming it is, perhaps the difference is that libgit2 uses some older version of the git protocol (maybe you can test with `git -c protocol.version=1 push`) or maybe libgit2 just has a performance bug somewhere in the push code. Can you tell if `git push` and `jj git push` transfer a similar amount of data?
- `git push` is around 6 seconds
- `protocol.version=1` is around the same time
- `jj` pegs one of my cores to 100% for the full minute w/ minimal I/O so it appears to be CPU-bound

Thanks for checking! Perhaps it's some performance bug in libgit2's push code then. I don't have any other ideas anyway.
Last time I checked, `jj git fetch` had a similar problem in that it had a heavy CPU-bound task before starting actual network I/O. It was because libgit2 unpacks the commit object for each ref. I don't see this problem on `jj git push`, but there might be some way to trigger it.
late follow-up, but fwiw `jj git fetch` is similarly slow for me on this (very large) repository if I don't narrow to specific branches (e.g. `jj git fetch -b main`).
zooming in on a section within the first indicated area, the profile is dominated by repeated calls that look like this under `git_smart__download_pack`:
zooming in on a section within the second indicated area, the profile is dominated by unpacking, inflating, and hashing objects:
Description
Right up front I want to acknowledge: (a) this is definitely an unusual situation, and (b) I totally get that it is likely to take a bit to sort through. But: I tried out Jujutsu on a very large repo from work a few minutes ago and found it's distinctly not yet ready to use there:
jj init --git-repo=.
jj status
(I'll add more operations to this list once I'm actually back at work in August!)
For scale: this repo has on the order of 3M LOC checked in, primarily JavaScript, TypeScript, and Handlebars, but with a mix of Java and Gradle as well, with a massive `node_modules` directory and a not-small bucket of things related to Gradle (both `gitignore`'d buuuut still massive), and it has hundreds of thousands of commits in its history, hundreds of active branches… and, annoyingly, also hundreds of thousands of tags (one for each commit; better not to ask).
For comparison, `git status` takes a second or two (again, I will time them when I'm back at work). I'm not using a sparse checkout here (other folks sometimes do, but for various reasons it's a non-starter for me :weary:).
Comparable open source repos might be something like Firefox or Chrome? I tried DefinitelyTyped, and its 3M LOC and mere 84,275 commits only took 9s to initialize and `jj status` took around a second. Even so, the comparable scale of the codebase itself and dramatically better performance suggests there may be something repo-specific (the tags?) causing the issue.
Steps to Reproduce the Problem
`git`.
`jj`.
Expected Behavior
It completes in a reasonable amount of time.
Actual Behavior
It completes in what honestly probably is a reasonable amount of time given the sheer scale of the things, but in a way that makes it much worse than Git for the moment.
Specifications