Open jmiserez opened 3 years ago
I've updated the issue with better numbers:
`git status --ignored` in 2.32.0.

So while our AV (Trend Micro) does have a large 4x impact, the issue itself looks to be responsible for a ~157x impact in this case, when comparing Git for Windows 2.32.0 vs Git for Windows 2.26.1.
@jmiserez could you revert https://github.com/git-for-windows/git/commit/515ff6a7d597fd8083c63a983f624ccd78bb2a4c and test the result? You have multiple options how to do that:

- `sdk cd git`, `git revert 515ff6a7d597fd8083c63a983f624ccd78bb2a4c`, `make -j$(nproc)`, then `./git --exec-path="$PWD" -C <directory> <command>`
- or `git revert 515ff6a7d597fd8083c63a983f624ccd78bb2a4c`, then edit `.github/workflows/git-artifacts.yml` by inserting a `push:` line before the `workflow_dispatch` trigger

@dscho Thank you very much for the help. I assume you meant https://github.com/git-for-windows/git/commit/2f55cc471e41028d08bd6ac33c25c1b587bb2660 rather than https://github.com/git-for-windows/git/commit/515ff6a7d597fd8083c63a983f624ccd78bb2a4c?
Either way, neither of the 2 commits is the culprit. And my initial hypothesis at https://github.com/git-for-windows/git/pull/2637#issuecomment-878931871 was wrong. I've removed the PR link from the issue above and edited the title.
But I found the offending commit: https://github.com/git-for-windows/git/commit/8d92fb292706fd8d13cfe55353b2ec9345153a3e ("dir: replace exponential algorithm with a linear one") is where the performance regression happened.
So this looks like it isn't actually a Git for Windows specific bug, but unfortunately only manifests on Windows. At the moment I don't know exactly which part of the code is the problem, as it's quite a substantial rewrite. I think we really need an example/test repository to get to the root cause; I'll see if I can come up with one that I can share. And maybe the original author of that commit has an idea of what happened.
To make sure I also tested the other 2 commits:
Sidenote: the downloadable SDK environment is pretty awesome and worked smoothly right out of the box.
> So this looks like it isn't actually a Git for Windows specific bug, but unfortunately only manifests on Windows.

I am not actually sure that this is true...

> replace exponential algorithm with a linear one
This yields a couple of hits on the Git mailing list: https://lore.kernel.org/git/?q=%22replace+exponential+algorithm+with+a+linear+one%22
A couple of ideas: it might be the removal of the `resolve_gitlink_ref()` function mentioned here, or it might be something else.
But I do wonder whether this really only manifests on Windows...
A colleague has built a repository that demonstrates the issue. It would be great if someone else could verify/reproduce this.
For the moment creating the symlinked structure requires a (portable) install of Node.js. I've added short install/uninstall instructions in case you/anyone else is interested in reproducing this on a machine. In the future it should be possible to create just the folders with a git testcase.
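As a rough idea of what a Node.js-free reproduction could look like, here is a minimal sketch that builds a deep, symlink-heavy tree under an ignored `node_modules/` directory. Everything in it is an assumption made for illustration: the function name `make_deep_tree`, the `pkgN`, `.store`, and `link` names, and the layout itself, which only imitates a pnpm-style structure. It is not the repository mentioned above, and it is not confirmed to trigger the slowdown.

```shell
#!/bin/sh
# Hypothetical sketch: NOT the reporter's repository, and not verified to
# reproduce the regression. It only imitates a deep, pnpm-like layout.

# make_deep_tree ROOT DEPTH
# Creates ROOT/node_modules/pkg1/node_modules/pkg2/... DEPTH levels deep,
# puts a symlink named "link" (pointing at a shared store directory) at each
# level, and writes a .gitignore that ignores node_modules/.
# The body runs in a subshell, so its variables do not leak to the caller.
make_deep_tree() (
  mkdir -p "$1" || exit 1
  root=$(cd "$1" && pwd) || exit 1   # absolute path, so symlink targets resolve
  depth=$2
  store="$root/node_modules/.store"
  mkdir -p "$store" || exit 1
  dir=$root
  i=1
  while [ "$i" -le "$depth" ]; do
    dir="$dir/node_modules/pkg$i"
    mkdir -p "$dir" || exit 1
    ln -s "$store" "$dir/link" || exit 1
    i=$((i + 1))
  done
  printf 'node_modules/\n' > "$root/.gitignore"
)
```

One possible use is `make_deep_tree ./repro 25`, followed by `git init ./repro` and `git -C ./repro status --ignored`. Note that in Git Bash, `ln -s` may copy its target instead of creating a real symlink unless native symlinks are enabled (for example via `MSYS=winsymlinks:nativestrict`), so check what actually ends up on disk.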
Run either from cmd.exe or Git Bash with `GIT_TRACE_PERFORMANCE=1`:

- `git status` -> **fast**
- `git status --ignored` -> **slow since 2.27.0**

To get rid of long path warnings:

- `git -c "core.longpaths=true" status --ignored` -> **slow since 2.27.0**
It takes ~3s on WSL, but several minutes on Windows. I have not yet tested a physical Linux machine but will do so shortly.
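For anyone who wants to compare the two environments without reading the trace output, a small wall-clock helper may be enough. This is only a sketch: `elapsed_seconds` is a hypothetical name, and it uses `date +%s`, so it resolves whole seconds only. That is fine for telling a ~3s run from a run of several minutes, but too coarse for sub-second comparisons.

```shell
#!/bin/sh
# Hypothetical helper, not part of Git: prints the whole seconds a command took.
# Usage: elapsed_seconds COMMAND [ARGS...]
# The command's own output (stdout and stderr, including any
# GIT_TRACE_PERFORMANCE lines) is discarded so only the timing is printed.
elapsed_seconds() (
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo "$((end - start))"
)
```

For example, `elapsed_seconds git -c core.longpaths=true status --ignored` can be run once from Git Bash and once from WSL in the same folder.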
A quick analysis using Sysinternals Process Monitor shows that the syscalls look different for Windows (git.exe) vs. WSL (git) when running `GIT_TRACE_PERFORMANCE=1 git -c "core.longpaths=true" status --ignored`. I'm not sure why the `CreateFile REPARSE` calls aren't necessary via WSL, but I'm also not sure if that is actually the issue. Full disclosure: These screenshots are taken with AV enabled, as I currently don't have access to a "clean" Windows machine. So the timings are off by 4x here.
Git for Windows 2.32.0 git:
Git on Ubuntu WSL 2.32.0:
(Scroll to the bottom for the Github issue boilerplate)
Problem description
Summary
2.27.0 seems to have introduced a performance regression which was not present in 2.26.1 and is not present in the Linux git clients, but is present in Git for Windows 2.27.0 and higher. Namely, extremely slow performance (15.1s in 2.32.0 vs. 0.09s in 2.26.1) when running `git status --ignored` on a repository with deep folder structures (~25 folders deep) containing symlinks (specifically pnpm folders).

`git status` without `--ignored` works fine (~0.04s), as the folders in question are then ignored. `git status --ignored` on Linux via WSL (same filesystem, machine, folder) runs fine (~0.55s) as well. The same with the old Git for Windows 2.26.1 runs fast as well (~0.09s).

In my tests, I've reproduced the issue when running Git for Windows 2.27.0 and later on Windows 10 (on a Windows filesystem), but not when running the identical Linux/Ubuntu version on the same Windows filesystem (via WSL). The issue is present for a large number of users in our organization across many different Windows 10 machines, but is never present when using Git for Windows 2.26.1 or earlier on the same machines or when using Linux git. The issue is present even with AV disabled.
EDIT: Repo demonstrating the issue here: https://github.com/git-for-windows/git/issues/3318#issuecomment-882048836 ~I'm sorry that I don't have a specific repository/project to demonstrate this issue, but it does happen across tens of users and several repositories at our organization. We have reduced the depth of our pnpm folder structures as much as possible (was around 100 before), but the issue remains as you can see in the traces below. It's pretty hard to come up with a public repository that replicates the issue cleanly enough to demonstrate, but maybe this information gives you an idea of what could be wrong.~
Traces: Git 2.32.0 Windows vs git 2.32.0 via WSL, no AV
This is with the newest Git. The repo contains symlinked pnpm node_modules directories ~25 folders deep. AV is disabled.
Git for Windows (git version 2.32.0.windows.1) via Git Bash:
Traces: Git on Ubuntu (git version 2.32.0) via WSL on same filesystem/folder:
Older traces from Git 2.29.0
In Git 2.29.0 there was also an additional "dir.c:2824" entry in the trace just before the last line, where the bulk of the time was spent. This isn't shown in the trace anymore with 2.32.0, but maybe this is relevant.
Traces: For reference Git 2.26.1 Windows on same repository
Github Issue boilerplate info
Setup
`could not open directory '...': Filename too long` (core.longpaths false); the message is then `could not open directory ...Function not implemented` (core.longpaths true). The issue is present regardless.

Details
Issue is present regardless of terminal. Specifically both from Git Bash and also when launched directly by IntelliJ.
Running `git status --ignored` on a repository with deep folder structures (~25 folders) containing symlinks, specifically pnpm node_modules folders.

Similar performance to Git for Windows 2.26.1 or any recent Git version on Linux.
Extremely slow performance (15.1s) when compared to Git for Windows 2.26.1 (0.09s), or any recent Linux version via WSL (0.55s).
Unfortunately at this time I have not been able to create a suitable public repository with such a deep PNPM folder structure, as many of the artifacts and repositories in question are not public.