martinvonz / jj

A Git-compatible VCS that is both simple and powerful
https://martinvonz.github.io/jj/
Apache License 2.0

Slow operations on very large repos #1841

Open chriskrycho opened 1 year ago

chriskrycho commented 1 year ago

Description

Right up front I want to acknowledge: (a) this is definitely an unusual situation, and (b) I totally get that it is likely to take a bit to sort through. But: I tried out Jujutsu on a very large repo from work a few minutes ago and found it's distinctly not yet ready to use there:

| Command | Time |
| --- | --- |
| jj init --git-repo=. | 4m 59s |
| jj status | 25s |

(I'll add more operations to this list once I'm actually back at work in August!)

For scale: this repo has on the order of 3M LOC checked in, primarily JavaScript, TypeScript, and Handlebars, with a mix of Java and Gradle as well. It also has a massive node_modules directory and a not-small bucket of Gradle-related things (both gitignore'd buuuut still massive). It has hundreds of thousands of commits in its history, hundreds of active branches… and, annoyingly, also hundreds of thousands of tags (one for each commit; better not to ask).

For comparison, git status takes a second or two (again, I will time them when I'm back at work). I'm not using a sparse checkout here (other folks sometimes do, but for various reasons it's a non-starter for me :weary:).

Comparable open source repos might be something like Firefox or Chrome? I tried DefinitelyTyped, and its 3M LOC and mere 84,275 commits took only 9s to initialize, and jj status took around a second. Given the comparable scale of the codebase itself, the dramatically better performance suggests there may be something repo-specific (the tags?) causing the issue.
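To check whether the number of refs (the tags in particular) differs between two repos of similar size, it may help to put numbers on the "shape" of each one. The following is a minimal sketch, not anything jj provides: the helper names (`classify_ref`, `count_refs`, `git_lines`, `repo_shape`) are made up for illustration, and it assumes `git` is on the PATH. It only uses `git for-each-ref` and `git rev-list --count --all`.

```python
import subprocess
from collections import Counter


def classify_ref(refname):
    """Bucket a full ref name by its namespace (hypothetical helper)."""
    if refname.startswith("refs/tags/"):
        return "tags"
    if refname.startswith("refs/heads/"):
        return "local branches"
    if refname.startswith("refs/remotes/"):
        return "remote branches"
    return "other"


def count_refs(refnames):
    """Count refs per bucket, skipping empty lines."""
    return Counter(classify_ref(name) for name in refnames if name)


def git_lines(repo, *args):
    """Run a git command inside `repo` and return its stdout as lines."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


def repo_shape(repo="."):
    """Return the commit count and per-namespace ref counts for `repo`."""
    refs = git_lines(repo, "for-each-ref", "--format=%(refname)")
    commits = int(git_lines(repo, "rev-list", "--count", "--all")[0])
    return {"commits": commits, **count_refs(refs)}
```

Running `repo_shape(".")` in both the work repo and in DefinitelyTyped would show whether the tag count is the main difference between them. That would only narrow down the hypothesis, not confirm the cause.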

Steps to Reproduce the Problem

  1. Check out a massive repo with git.
  2. Initialize it with jj.
  3. Run operations on it.
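For anyone who wants to reproduce the timings above on their own repo, here is a rough timing harness. It is a sketch under assumptions: the function names (`time_command`, `format_duration`, `time_all`) are invented for this example, it measures wall-clock time only, and it deliberately lists just the read-only `git status` and `jj status` commands mentioned in this report.

```python
import subprocess
import time


def time_command(argv, cwd=None):
    """Run one command and return (elapsed_seconds, exit_code).

    Output is discarded so that terminal rendering does not skew the timing.
    """
    start = time.perf_counter()
    completed = subprocess.run(
        argv,
        cwd=cwd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    elapsed = time.perf_counter() - start
    return elapsed, completed.returncode


def format_duration(seconds):
    """Format seconds in the same style as the table above, e.g. '4m 59s'."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes}m {secs}s" if minutes else f"{secs}s"


# Commands to compare; both appear in this report and neither modifies the repo.
COMMANDS = [
    ["git", "status"],
    ["jj", "status"],
]


def time_all(commands, cwd=None):
    """Time each command in order and return (command, elapsed, exit_code) rows."""
    rows = []
    for argv in commands:
        elapsed, code = time_command(argv, cwd=cwd)
        rows.append((" ".join(argv), elapsed, code))
    return rows
```

Calling `time_all(COMMANDS, cwd="/path/to/repo")` (the path is a placeholder) and printing each row with `format_duration` gives numbers that can be compared directly. Running it a second time is worthwhile, since the first run may include cold filesystem caches.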

Expected Behavior

It completes in a reasonable amount of time.

Actual Behavior

It completes in what honestly probably is a reasonable amount of time given the sheer scale of things, but it is much slower than Git for the moment.

Specifications