microsoft / WSL

Issues found on WSL
https://docs.microsoft.com/windows/wsl
MIT License

git status slow on DrvFS #981

Closed nathanmerrill closed 5 years ago

nathanmerrill commented 8 years ago

I don't believe this is specific to git, but it exposes the problem nicely.

To do this test, I cloned Mono to /mnt/c/mono

When running time git status on WSL:

real    0m14.276s
user    0m0.375s
sys     0m36.656s

However, if I run the same command on Git Bash for windows, it runs significantly faster:

real    0m0.790s
user    0m0.015s
sys     0m0.031s

This problem is not related to the /mnt folder, as I get similar runtimes in ~/. I don't have an antivirus or firewall running (including Windows Defender).

I've also run bonnie++ to test my HD speed:

Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
TEST             8G           127291  58 85793  57           225356  63 446.2 455
Latency                         147ms     155ms             42017us     876ms

1.97,1.97,TEST,1,1472070956,8G,,,,127291,58,85793,57,,,225356,63,446.2,455,,,,,,,,,,,,,,,,,,,147ms,155ms,,42017us,876ms,,,,,,

I believe the problem lies with random seeks per second. I've run the same command on a different Linux box with an SSD (my personal machine has an SSD as well), and I'm getting about 10 times fewer random seeks (but equivalent numbers for everything else). I've confirmed this theory by running dd on both Git Bash and WSL, and I get roughly equivalent numbers there as well.

I'm running on Windows Version 1607 Build 14393.82

russalex commented 8 years ago

Thanks for the feedback!

There are some known issues with file system performance. We go into a little detail on our implementation on our Blog and that should shed a bit of additional light on the situation. Also, there are some discussions on filesystem performance (my favorites are #873 and the Phoronix article from a while back).

This is not an easy problem, but it is something we are working diligently to improve. Unfortunately I can't give an ETA, but we do have some things in the pipe which will have some impact.

nathanmerrill commented 8 years ago

I did see #873 as well, but my issue is different, as it occurs anywhere in the filesystem, not just under /mnt. I'd also like to note that I've talked with other people using WSL, and they aren't experiencing nearly the same amount of slowness I've gotten. I'm unsure exactly why my machine is different, but there's definitely something there.

therealkenc commented 8 years ago

It appears to be a problem with VolFs (/home/me/chromium/src) not captured by the Phoronix tests. If you check out, say, the Chromium tree with all branches and do a git status, you're pretty much hooped. It takes about a minute, give or take, on Linux/Ext4 or Windows/NTFS with my spinning disk. On WSL/VolFs I've never been patient enough to see it return.

therealkenc commented 6 years ago

Updated the title because the original made cause assumptions, and many other issues have been duped into this one.

nathanmerrill commented 6 years ago

While my title may have been inaccurate, I don't think that "git status is slow" accurately describes the problem either, as it is just a symptom of the problem.

therealkenc commented 6 years ago

Indeed. The new title merely follows the guidelines of CONTRIBUTING.md (example given: "traceroute not working"). It summarises the nature of the repro given in the OP, and is actionable. Presumably this issue# would be closed when the performance of git status on DrvFS adequately matches that of git status on say Cygwin or ntfs-3g on Linux (for some arbitrary value of adequate).

jakeg commented 6 years ago

Has there been any recent progress on this? I'm experiencing crippling performance (> 30s) when running git status under WSL, which should normally take < 1s in my repo.

piksel commented 6 years ago

Just to provide a temporary solution:

#!/bin/bash
# WSL 'git' wrapper, save as /usr/local/bin/git and chmod a+x

if [ "${PWD:0:5}" = "/mnt/" ]; then
  /mnt/c/Program\ Files/Git/bin/git.exe "$@"
else
  /usr/bin/git "$@"
fi

Result: image

lucasvdh commented 6 years ago

@piksel You're the best! How did I not think of this before.

sunilmut commented 6 years ago

@jakeg - We are constantly looking at ways to improve disk perf and are making regular investment in that space. If there is anything significant to report, we will. For best updates, you can monitor the WSL release notes here. We understand the ask here. Please bear with us here as this is a major undertaking.

MichaelTong commented 6 years ago

Here is an alternative to @piksel's script, just in case someone is using symlinked directories like me.

#!/bin/bash
# WSL 'git' wrapper, save as /usr/local/bin/git and chmod a+x

# Resolve symlinks so a symlinked directory under /mnt is still detected
REALPATH=$(readlink -f "$PWD")

if [ "${REALPATH:0:5}" = "/mnt/" ]; then
  git.exe "$@"
else
  /usr/bin/git "$@"
fi

er1c commented 6 years ago

The bash script wrapper is good, it's fast. The one thing I'm now missing is console colors, is there a fix for that?

piksel commented 6 years ago

@er1c This is not related to WSL per se, so kind of off topic, but you can force git to use colors with either --color=always or -c color.ui=always (depending on what command you're issuing), e.g.:

git log --graph --color=always

IkeTheDestroyer commented 6 years ago

You can run the following commands to get colors working, depending on what you want to colorize:

log: git config color.log always
status: git config color.status always
diff: git config color.diff always

I am running the windows version of git with coloring and love the speed up.

er1c commented 6 years ago

@IkeTheDestroyer thanks!

I had to add a set HOME=%USERPROFILE% to the "Environment" tab in ConEmu to get around a bad windows-working directory, but adding --global to each of your commands works perfectly for me - and across all of my projects

yeganer commented 6 years ago

While the hack to use git.exe instead of /usr/bin/git does work for most applications, it is not an acceptable solution as it requires duplicating the configuration and does not allow me to use my configured editor.

Can someone comment if the performance stays the same with the Spring Creators update?

nesl247 commented 6 years ago

Has anyone tried https://blog.github.com/2018-04-05-git-217-released/#speeding-up-status-with-watchman?

samiraguiar commented 6 years ago

There's another problem: the first time that git-status is run on WSL after running it on Git For Windows shell it is painfully slow, much slower than consecutive runs.

strace indicates that on this first time, Git will see every file in the repository as untracked and stat, open and read them (but in the end nothing will be reported as changed). Next runs will be faster and this process will only happen again if you run git-status on Git For Windows' shell.

The index file is also changed between those cross-layer operations, so maybe the index is being messed up somehow (or there could be an option in Git to avoid this).

I couldn't investigate any further, but maybe that can help somehow.

(PS: I have core.filemode set to false and core.autocrlf set to true)
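
One experiment that may be worth trying for the behaviour described above, offered as a sketch and not a confirmed fix: git's index caches stat data for every file, and two different git builds working on the same tree may disagree on fields such as the inode number or ctime. core.checkStat and core.trustctime are existing git settings that relax that comparison; whether they help in this particular case is an assumption. The function name relax_stat_checks is made up for illustration.

```bash
#!/bin/bash
# Hypothetical helper: relax git's stat comparison for one repository.
# Assumption: the slow first run comes from stat fields that differ between
# Git for Windows and the Linux git; this has not been verified in this thread.
relax_stat_checks() {
  local repo=${1:-.}
  # Compare only whole-second mtime and file size
  git -C "$repo" config core.checkStat minimal
  # Do not treat a changed ctime as a sign that the file changed
  git -C "$repo" config core.trustctime false
}
```

Usage would be something like relax_stat_checks /mnt/c/mono, followed by running git status from both sides again to see whether the slow first run still appears.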

graemechristie commented 6 years ago

Is there an update? Git seems like a pretty pivotal use case for WSL. Currently, git is pretty much unusable on a Windows filesystem.

mojjy commented 6 years ago

It gets worse when you're using a prompt that updates based on your git status.

bitcrazed commented 6 years ago

Please follow https://github.com/Microsoft/WSL/issues/873 which is the primary disk IO perf issue thread

GenesisCraig commented 6 years ago

Has this been assigned to anyone in the development team? Git is one of my core uses for WSL.

bitcrazed commented 6 years ago

@GenesisCraig If, as suggested, you take a look at #873 - the main issue tracking this problem - you'll notice this comment from Sven Groot - the engineer responsible for much of WSL's IO infrastructure.

We appreciate that WSL's disk IO perf isn't yet where we want it to be, but know that we're working hard across several teams, deadlines, priorities, etc. to figure out some significant improvements.

Please bear with us.

therealkenc commented 6 years ago

Maybe just close and dupe this if we want to make #873 the landing zone. Your call.

bitcrazed commented 6 years ago

Possibly. What say you @tara-raj?

therealkenc commented 6 years ago

Or ask Sven. It depends mostly on whether there are multiple perf issues that should be treated as separate (because they'll be addressed separately) or whether it makes more sense to have a "yes we know WSL filesystems ops are slow we're working on it" thread. git is "special" in that it causes a stat(2) storm. Stuff like apt update or a big parallel compile or just a big tar xf (message) is slow on both DrvFs and LxFs too, but they aren't (necessarily) bounded by stat. Also ref #2626.
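
To look at the per-file metadata cost on its own, without git involved, a small micro-benchmark along these lines could be used. It is only a sketch: the function names make_files and stat_all and the example paths are made up, and a bash file test is used as a stand-in for the stat calls git makes.

```bash
#!/bin/bash
# Hypothetical micro-benchmark for a "stat storm": many metadata lookups, no file reads.

# make_files DIR COUNT: create COUNT empty files in DIR.
make_files() {
  local dir=$1 count=$2 i
  mkdir -p "$dir"
  for ((i = 0; i < count; i++)); do
    : > "$dir/file_$i"
  done
}

# stat_all DIR: check every entry in DIR for existence (one metadata lookup
# per file) and print how many entries were found.
stat_all() {
  local dir=$1 n=0 f
  for f in "$dir"/*; do
    [ -e "$f" ] && n=$((n + 1))
  done
  echo "$n"
}
```

For example, make_files /mnt/c/tmp/statbench 5000 followed by time stat_all /mnt/c/tmp/statbench, then the same two commands with a directory under ~/, would show whether the gap also appears for plain metadata lookups.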

therealkenc commented 6 years ago

That said, after spending $7.5 billion dollars on github, this narrow ask could probably be fixed pretty quick with (I dunno) a ten or twenty million dollar engineering budget. Seems like rounding error to me. And what's with the Mac peeps getting the good stuff before us, huh? 😉

GenesisCraig commented 6 years ago

Just to put my 2-cents where it does not belong, I'm in complete agreement with @therealkenc in terms of real engineering budget needing to be allocated to WSL. What your team has done so far is amazing, but if WSL is intended to be a mechanism for wooing developers away from Macs/VMs for their development environments, getting git, or for that matter any utilities that work on thousands of little files, to perform well is key. Working on hundreds, thousands or millions of small-large files is de rigueur for programmers, analysts and sysadmins.

Giving the WSL team some funding/time/resources to be able really tackle the hard computing problems like disk I/O performance is essential to keeping WSL from becoming perceived as a half-hearted attempt to woo developers/sysadmins. It's not yet, but this type of thing could be a subconscious litmus test for many.

er1c commented 6 years ago

This is/was a core usage for me too. The talk about how the git.exe is able to have windows-specific file system optimizations definitely explains why it works so much better than the linux binary. The solution discussed in this thread almost does everything I need. I could never get git log to actually page/use less while in ConEmu, regardless of what git config --pager type of commands I tried :) If anyone else has figured that out, then I think the solution would be a viable workaround in the meantime.
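
One possible way to get paging and colors back when delegating to git.exe is sketched below. It assumes less is installed inside WSL and git.exe is on the PATH; the function names choose_git, is_paged_command and git_wrapper are made up for illustration, and the list of paged subcommands is a guess at what is useful, not git's own rule.

```bash
#!/bin/bash
# Sketch of a WSL 'git' wrapper that keeps paging and colors for git.exe.

# Print which git binary to use for an already resolved directory.
choose_git() {
  case "$1" in
    /mnt/*) echo "git.exe" ;;        # DrvFs path: use Git for Windows
    *)      echo "/usr/bin/git" ;;   # anything else: use the Linux git
  esac
}

# Succeed for subcommands whose output is normally worth paging.
is_paged_command() {
  case "$1" in
    log|diff|show) return 0 ;;
    *)             return 1 ;;
  esac
}

# Entry point; call as: git_wrapper "$@"
git_wrapper() {
  local bin
  bin=$(choose_git "$(readlink -f "$PWD")")
  if [ "$bin" = "git.exe" ] && is_paged_command "$1" && [ -t 1 ]; then
    # git turns colors off when writing to a pipe, so force them on and
    # let less pass the escape sequences through (-R).
    "$bin" -c color.ui=always "$@" | less -FRX
  else
    "$bin" "$@"
  fi
}
```

Saved as /usr/local/bin/git with git_wrapper "$@" as the last line, this pages only when stdout is a terminal, so scripts that capture git output are left alone.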

bitcrazed commented 6 years ago

@therealkenc - If only that was the way big org's worked ;)

@GenesisCraig - Appreciate your input, but as I (often) recommend, let's limit "pontification from the sidelines" esp. in relation to commentary on teams one is not involved with and/or have no knowledge of how said teams operate. If it was as simple as "throw a couple of devs at it", the problem would be fixed by now.

Know that the team is painfully aware of these issues, and is working hard with several partner teams with the specialist knowledge, understanding, skills, and mandate to work on the relevant improvements in up-coming releases. We are passionately committed to resolve this issue given time to work through the highly complex changes required.

bmayen commented 6 years ago

Sounds like you just don't want it enough! I kid, I kid ;)

Awesome work. Can't wait to see this all come together.

bitcrazed commented 6 years ago

@bmayen Bwaahhhahahh!

therealkenc commented 6 years ago

> @therealkenc - If only that was the way big org's worked ;)

To be clear, it was completely in jest. There isn't anything to "agree" with me here. If I thought it worked like that I would be on a private island, having earned $2.5b for banging out a game with crappy voxel graphics in Java.

amyers735 commented 6 years ago

@er1c I've found the lack of paging on git log to be a problem also.

Forgive my lack of "bash-fu" but I came up with this as an interim hack:

#!/bin/bash
# WSL 'git' wrapper, save as /usr/local/bin/git and chmod a+x

REALPATH=$(readlink -f "$PWD")
ARG1=$1

if [ "${REALPATH:0:5}" = "/mnt/" ]; then
  # Quote ARG1 so the test does not break when git is run without arguments
  if [ "$ARG1" = "log" ]; then
    git.exe "$@" | more
  else
    git.exe "$@"
  fi
else
  /usr/bin/git "$@"
fi

dkrieger commented 6 years ago

This problem has plagued me for quite some time. The workaround I've opted for is to limit my working directory by setting git config core.sparseCheckout true and excluding directories I don't need by editing .git/info/sparse-checkout, then running git read-tree -mu HEAD.
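
The three steps above could be wrapped up roughly as follows. This is a sketch under assumptions: the function name enable_sparse_checkout is made up, the repository is assumed to be a normal (non-bare) clone with its metadata in .git, and the patterns in the usage note are placeholders.

```bash
#!/bin/bash
# Hypothetical helper: enable_sparse_checkout REPO_DIR PATTERN...
# Writes the given patterns to .git/info/sparse-checkout and re-reads the tree.
enable_sparse_checkout() {
  local repo=$1
  shift
  git -C "$repo" config core.sparseCheckout true
  mkdir -p "$repo/.git/info"
  # One pattern per line; a leading '!' excludes a path
  printf '%s\n' "$@" > "$repo/.git/info/sparse-checkout"
  # Update the working tree to match the new patterns
  git -C "$repo" read-tree -mu HEAD
}
```

For example, enable_sparse_checkout ~/big-repo '/*' '!/docs/' would keep everything except a docs directory in the working tree (both names are placeholders).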

Though not ideal, working with massive repos is often inevitable, and it may not always be possible to break them down into smaller repos for political and/or convenience reasons. If I find I work on a large enough portion of the repo that sparse checkout doesn't bring performance within a reasonable range, I'll clone multiple copies of the repo with more aggressive sparse checkout settings, a separate copy for each "context" I'm working in.

Additionally, it helps to run git gc --aggressive periodically; be sure to run it after making significant changes to sparse checkout settings.

I'm excited for the filesystem IO issues in WSL to be worked out, as it will make windows a significantly more pleasant development environment. It seems to be a likelier outcome than Linux desktop supporting various Windows/Mac -only software I need, not to mention more affordable than switching to mac hardware.

ckuai commented 6 years ago

Thanks everyone. I was facing this exact problem. The git.exe solution here is really good and gave me back my performance. But getting ssh to work with git was troubling me, so my solution uses PuTTY's pageant.

#!/bin/bash
# WSL 'git' wrapper, save as /usr/local/bin/git and chmod a+x

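REALPATH=$(readlink -f "$PWD")
ARG1=$1

# Note: this matches paths under /c/ or /d/, so it appears to assume the
# Windows drives are mounted there rather than under the default /mnt/
if [[ ${REALPATH:0:5} = "/c/"* ]] || [[ ${REALPATH:0:5} = "/d/"* ]] ; then
  export GIT_SSH="/path/to/plink.exe"
  export WSLENV="GIT_SSH/p"
  # Quote ARG1 so the test does not break when git is run without arguments
  if [ "$ARG1" = "log" ]; then
    git.exe "$@" | more
  else
    git.exe "$@"
  fi
else
  /usr/bin/git "$@"
fi

bitcrazed commented 6 years ago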

For those who're interested in WSL's disk IO perf challenges, you may be interested in @SvenGroot's awesome writeup over in the main issue tracking disk perf issues (#873)

ezillinger commented 6 years ago

Has anyone else noticed a huge regression in speed since the October Update? Linux git was borderline tolerably slow before the update but now it's unbearable. It could absolutely be a problem on my end, especially since I didn't see anything about WSL in the changelog.

dismay commented 5 years ago

Here is my solution using ssh-agent instead of plink to solve the issue with existing ssh keys for the wsltty/mintty terminal.

Add the following to ~/.bashrc, do not forget to replace <YOUR USER>:

mkdir -p /mnt/c/ssh-agent
export SSH_SOCK_FILE="/mnt/c/ssh-agent/ssh-agent.sock"
SSH_AGENT_PID=$('/mnt/c/Program Files/Git/usr/bin/ps.exe' -f ssh-agent | grep agent | sed -r 's/^\S+\s+(\S+).*$/\1/');
if [ -z "$SSH_AGENT_PID" ]; then
  rm -f "${SSH_SOCK_FILE}"
  eval $('/mnt/c/Program Files/Git/usr/bin/ssh-agent.exe' -s -a ${SSH_SOCK_FILE/\/mnt/}) &>/dev/null
  export SSH_AUTH_SOCK=${SSH_SOCK_FILE}
  export WSLENV=SSH_AUTH_SOCK/p
  '/mnt/c/Program Files/Git/usr/bin/ssh-add.exe' /c/Users/<YOUR USER>/.ssh/id_rsa &>/dev/null
fi
export SSH_AUTH_SOCK=${SSH_SOCK_FILE}
export WSLENV=SSH_AUTH_SOCK/p

And, of course, use the git wrapper mentioned above.

mkarpoff commented 5 years ago

Did something change in the last few days? My performance on mounted drives for file IO commands like git went from the expected slow to almost as fast as when they are run in the WSL directory.

MichaelTong commented 5 years ago

> Did something change in the last few days? My performance on mounted drives for file IO commands like git went from the expected slow to almost as fast as when they are run in the WSL directory.

@mkarpoff That's interesting. Are you on an Insider build? I'd like to try it out too.

mkarpoff commented 5 years ago

I'm on slow ring build 17763.253. I think it might have to do with windows defender getting updated because I got some updates to it.

MichaelTong commented 5 years ago

@mkarpoff I'm on the same build as yours. My feelings are it's much faster than what it used to be, but still not as fast as on native linux.

mkarpoff commented 5 years ago

So I did some VERY BASIC benchmarking with a git repo. I cloned the same repo on 4 different machines: my laptop (Windows + WSL Ubuntu 18.04), a university server (Ubuntu 16.04), a university lab computer (Ubuntu 16.04), and my own personal server from Cybera (Ubuntu 18.04). I don't have the specs for a lot of these machines, but I know the hardware is very different, so take all of this with some salt. I cloned the repo onto all the machines; in the Windows case I cloned it into my "Documents" folder as well as into the WSL home folder. I then ran time git status on each of them 10 times.

WSL Home: 0.032 s
WSL Documents: 0.042 s
University Server: 0.012 s
University Lab Machine: 0.011 s
Cybera Server: 0.013 s

This does show that WSL is still slow relative to native Linux, but this command used to take around 0.100s - 0.200s in the Documents directory for me, so I'll call that a huge win.

I should also mention that this is with Windows Defender and OneDrive running, and OneDrive is set to back up the "Documents" directory.
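
A repeated timing run like the one described could be scripted roughly as below. This is a sketch, not the method used for the numbers above: the function name bench is made up, and date +%s%N assumes GNU date, as found on Ubuntu.

```bash
#!/bin/bash
# Hypothetical helper: bench RUNS CMD...
# Runs CMD RUNS times and prints the mean wall-clock time per run in milliseconds.
bench() {
  local runs=$1
  shift
  local start end i
  start=$(date +%s%N)            # nanoseconds since the epoch (GNU date)
  for ((i = 0; i < runs; i++)); do
    # Only the timing matters here, so output and exit status are ignored
    "$@" > /dev/null 2>&1 || true
  done
  end=$(date +%s%N)
  echo $(( (end - start) / runs / 1000000 ))
}
```

Running bench 10 git status inside each clone gives one number per location that can be compared directly. For very small repositories a millisecond resolution may be too coarse, so more runs or a larger repository would give a clearer picture.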

MichaelTong commented 5 years ago

@mkarpoff how large is your project?

mkarpoff commented 5 years ago

It's very small, only 300 KB.

therealkenc commented 5 years ago

Consolidate #873

EL-shadow commented 4 years ago

Using git.exe from WSL has greatly sped up my work. But I ran into the problem that the git hooks that worked with the WSL Ubuntu git did not work with git.exe: [GIT-HOOKS WARNING] Non-executable ... is skipped

intangir commented 2 years ago

Just ran into this on a brand new WSL2/Ubuntu setup with the latest builds and versions. It was so painfully slow that I searched and found this issue.

As an FYI, I was using the exact same working directory setup before on WSL1 on an older Windows 10 build, with Linux git working on files in a /mnt/c/ directory, and I never noticed any issues whatsoever.