src-d / go-git

Project has been moved to: https://github.com/go-git/go-git
Apache License 2.0
4.9k stars, 541 forks

Status() is slow with a large number of untracked files #844

Open tych0 opened 6 years ago

tych0 commented 6 years ago

I have a tree with a large number of untracked files, and calling Status() is slow. When I profile it, it seems to be hashing all of the untracked files, which is unnecessary.

rafi commented 5 years ago

Even without untracked files, Status() takes about 12s on a repository with a large, supposedly ignored ./node_modules/ directory. Deleting this directory reduced the Status() time to 1.5s: still slow, but much faster. This suggests Status() doesn't take .gitignore into account when traversing the working directory.

isacikgoz commented 5 years ago

Can confirm this issue. Maybe there should be a way to take .gitignore files into account.

dedelala commented 5 years ago

Regardless of untracked files and ignores, it is quite slow in large worktrees. I have been testing with https://github.com/mawww/kakoune. On a clean shallow clone it's an order of magnitude slower than calling git status --porcelain. After running a build, that time roughly doubles again.

jfontan commented 5 years ago

After a quick look there seem to be two main problems:

Probably the biggest speedup could come from only calculating the SHA-1 of the file when needed. This may be when the file is in both trees and has the same size.

andrewrynhard commented 5 years ago

Is this planned for any upcoming release? It makes any tools we built based on go-git very painful to use.

tcolgate commented 5 years ago

This is really killing us too.

dennis-tra commented 4 years ago

I also ran into this issue. It's a pity, because this is such an amazing project, but this issue makes it almost impossible to use for my use case.