Closed paulmillr closed 11 years ago
'Freaking slow' doesn't help me make it faster. `vcs_info` is even larger.
Right, but I do not think profiling will help much. It just seems logical. When your machine constantly does IO and you have an HDD instead of an SSD, doing one IO command will be faster than doing 5 (10?).
Also, you seem to have made a typo in this guy's name.
I do not have an SSD, and, for me, it's fast enough. You can execute `git-info off` for very large repositories.
I'm going to invite @ColinHebert into this conversation.
Try it on a removable usb drive. I reckon that would make a difference.
A solution is to use `timeout`. But of course it can only be used on one command, not the entire function. Or it may be possible to create a `timeout` function to set a running-time limit on a shell function. Something like this (link to my dotfile repo; the implementation doesn't really work on zsh).
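One way such a `timeout` function could look is sketched below. It is not the implementation linked above: `run_with_timeout` is a hypothetical name, and the approach (background job plus a sleeping watchdog) is only one of several. It assumes a POSIX-like shell with `local`. Note that when the target is a shell function, the kill stops the subshell running it, not necessarily child processes that function has spawned, and an interactive zsh may print job-control messages.

```sh
# run_with_timeout SECONDS COMMAND [ARGS...]
# Hypothetical helper, not part of git-info: runs COMMAND (an external
# command or a shell function) and kills it if it exceeds SECONDS.
run_with_timeout() {
  local limit=$1 cmd_pid watchdog_pid rc=0
  shift

  # Run the command in the background so that it can be killed.
  "$@" &
  cmd_pid=$!

  # Watchdog: sleep, then terminate the command if it is still running.
  # Its output is discarded so that callers using $(...) are not blocked.
  ( sleep "$limit"; kill "$cmd_pid" ) >/dev/null 2>&1 &
  watchdog_pid=$!

  # Collect the command's exit status (non-zero if it was killed).
  wait "$cmd_pid" 2>/dev/null || rc=$?

  # The command is done, so the watchdog is no longer needed.
  kill "$watchdog_pid" 2>/dev/null || true
  wait "$watchdog_pid" 2>/dev/null || true
  return "$rc"
}
```

Usage would be along the lines of `run_with_timeout 1 git status --porcelain`, treating a non-zero exit status as "no information available".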
How does `git-info` compare to `vcs_info`? Do you consider that `git-info` does slightly more than `vcs_info` in some cases?
Hard to explain, but it takes a second for the prompt to display compared to using `vcs_info`.
I wonder if `vcs_info` is caching. @benohara `git-info` is not that complicated. Don't be afraid to look it over.
For @benohara, if I had to guess, I would say that it's due to git submodules; they tend to slow down `git status`, which is the biggest call made in `git-info`.
You can try that theory with `zstyle ':omz:module:git:ignore' submodule 'all'`; it should skip `git status` on your submodules and go way faster.
Regarding the speed of `git-info` itself, I think it could be slightly improved by checking if the appropriate zstyle is set (`zstyle -t context style [ strings ...]` does that, I believe).
For example, there is no need to run `git symbolic-ref -q HEAD` if you don't display the branch name, or `git rev-parse --symbolic-full-name --verify HEAD@{upstream}` if you display nothing when the local branch isn't synchronised with the remote branch.
Those are minor improvements but should make things slightly faster (I think).
And to answer the initial question, should we have a `git-info` light: I think that if you need a lighter `git-info` (i.e. with fewer features), you should consider using `vcs_info`.
`git-info` is, as far as I know, not here to be lighter or faster than `vcs_info`, but to add more features that you couldn't do in `vcs_info` without writing terrible code (because there are not enough hooks for everything) or slower code.
I have nothing personal against `vcs_info`, and it works very well if you want to keep your shell simple (and you can use the same style for every VCS). The reason why I don't use it is that I work all day long with git, and I need more info than just "is the local repo dirty" and the current branch name; I think I use 90% of what is executed in `git-info`.
I have looked at `vcs_info`, more specifically `VCS_INFO_get_data_git`. It does not cache. It does not do anything clever to be faster. It uses `git diff-index` to get a few things, namely staged, non-staged, and commit in addition to the branch and action.
`git-status` is slow. There is no way around it as far as I know. If you do not need a lot of information, use a theme that uses `vcs_info`. If you want a lot of information, use a theme that uses `git-info`.
As @ColinHebert said, any changes that can be made to `git-info` are trivial and are not likely to provide a perceptible increase in speed.
For what it's worth, I have a `git-info-fast` in my branch. It doesn't do some of the remote lookups and runs much faster (measured subjectively, of course) than the existing `git-info`.
I've been meaning to do a proper pull request (for this and other things) and now that the repo's been split, hopefully that can happen soon.
If anyone's interested, it's sitting here for now.
What does `git-info-fast` do that couldn't be available (easily) with `vcs_info`?
I mean, as said in this discussion, the difference between `git-info` and `vcs_info` is that one provides more information while the other is compatible with every (or almost every) VCS.
It seems to me (correct me if I'm wrong) that `git-info-fast` provides the same amount of detail as `vcs_info` while, like `git-info`, not being compatible with other VCSs.
Not that what you did isn't efficient or useful, but how does this solution compare to the two existing solutions already used by OMZ users?
@pbrisbin I am not merging that. Other than what @ColinHebert said, it's broken, especially the way it checks if you are inside of a repository.
I am open to making `git-info` faster without removing functionality, perhaps by using different low-level `git` executables, such as `git-diff-index`.
Is there a Git daemon that uses `inotify` (Linux), `FSEvents` (Mac OS X), `kqueue` (Mac OS X, BSD), `ReadDirectoryChangesW` (Windows) to always be up to date on work tree changes in order for `git status` to run instantly by not having to walk said tree?
Should we cache `git status`, then use a directory change notification library to update the changed file counts for added, modified, removed, renamed, and so on?
@sorin-ionescu awesome idea, :+1:
@paulmillr Someone else has had the same idea: inotify daemon speedup for git. Unfortunately, it was not successful.
So, who wants to extend kqwait to attempt caching + file system notifications? You cannot expect me to do everything.
I'm not really fond of having that inside Prezto. If anything were done, I would prefer to see an extension of git itself speeding up `git status`; I'm not sure I'm comfortable with having my shell spawning daemons in every git repo I own and caching things weirdly.
I would be all for a new project to replace `git status` or enhance it. (As I'm trying to play with Ruby on my weekends, I might try that actually, but it's for fun; don't expect anything.)
You can't do it in Ruby. It's low level kernel stuff. You'll have to do it in C.
You could write 95% of it in Ruby on top of 5% of C bindings. Some already exist for inotify and libgit2.
Personally, I think there's a hole in the market for a "git-prompt" which efficiently gives prompt-friendly (and formattable) status output.
Daemon-watcher-caching sounds useful, but a secondary concern to me. My git prompt's plenty fast so long as I'm on an SSD (which will soon be the norm) and I take out the calls that needed network.
I also agree -- as the requirements for this grow, you've moved well out of shellrc territory.
On Tue, Oct 2, 2012 at 4:46 PM, Sorin Ionescu notifications@github.com wrote:
> You can't do it in Ruby. It's low level kernel stuff. You'll have to do it in C.
Reply to this email directly or view it on GitHub: https://github.com/sorin-ionescu/prezto/issues/221#issuecomment-9086281.
So, I've been toying with trying to make `git-info` faster. I've made a lot of changes on this issue's branch.
Besides the boatload of if statements to test if a zstyle has been defined, it now also lets you choose between classic `git-info` status (full) and `vcs_info` status (partial), which only shows indexed (staged), via format code `%i`, and unindexed (unstaged), via format code `%I`.
zstyle ':prezto:module:git:info' status 'partial'
zstyle ':prezto:module:git:info:branch' format ':%F{green}%b%f'
zstyle ':prezto:module:git:info:indexed' format ' %B%F{green}i%f%b'
zstyle ':prezto:module:git:info:unindexed' format ' %B%F{blue}I%f%b'
zstyle ':prezto:module:git:info:keys' format \
  'prompt' ' %F{blue}git%b' \
  'rprompt' '%i%I'
Please test this new `git-info` for speed and bugs.
# Switch to git-info theme.
time (git-info)
# Switch to vcs_info theme.
time (vcs_info)
@sorin-ionescu I don't intend to do any low-level stuff; there are already plenty of tools to use inotify and FSEvents. Worst case scenario, I would have to do some Ruby FFI (I would very much like to avoid that anyway).
Plus, it would be easier to move to C if a POC can be set up quickly. IMHO, only Perl, Python, and Ruby are viable languages for this POC; there is no way I'd do that in Perl, so I'll try with Ruby.
@pbrisbin I think it will still be useful when you work with a lot of submodules (which is my case, about 100 submodules in my main project at work)
@sorin-ionescu heh, applying ifs to check if the zstyle is used rings a bell. But anyway, I think our main problem is (and will stay for a while) this `git status`, which is incredibly slow (at least that's what bothers me the most).
@ColinHebert Well, with `%i` and `%I`, you can now have `vcs_info` status, including its deficiency of not detecting untracked files. The new boatload of if statements, we should probably keep. The `vcs_info` style status, I'm not too sure.
Benchmarking it against `vcs_info` themes would be useful.
The new `git-info` is slightly faster.
Old:
0.04s user 0.09s system 85% cpu 0.153 total
New (status enabled):
0.04s user 0.08s system 87% cpu 0.138 total
New (status not enabled):
0.02s user 0.05s system 87% cpu 0.085 total
I've toyed with a `peepcode` theme clone called `peepcode_git_info` that uses `git-info`.
peepcode (vcs_info):
0.04s user 0.07s system 87% cpu 0.124 total
peepcode_git_info (git-info):
0.03s user 0.06s system 86% cpu 0.104 total
It's probably faster because git-info does not have stgit support.
The `git-info` version is a lot more readable than the `vcs_info` version.
Comments?
@ColinHebert How do multiple calls to `git ls-files` compare to one call to `git status --porcelain`, I wonder?
Hum, I'm not so sure about `ls-files`; it's really recommended to stay away from it (for scripting). If we want to go with plumbing commands, we should take a look at `git diff-index` and `git diff-files`.
I did a really quick test, here is what we would like to have:
added (to the WD/untracked) :
git ls-files -o --exclude-standard
added (to the index):
git diff-index HEAD --name-status --cached (--find-renames)
removed (from the WD):
git diff-files --name-status
removed (from the index):
git diff-index HEAD --name-status --cached (--find-renames)
modified (in the WD):
git diff-files --name-status
modified (in the index):
git diff-index HEAD --name-status --cached (--find-renames)
renamed (in the WD): NOT RELEVANT
renamed (in the index):
git diff-index HEAD --name-status --cached --find-renames
I haven't checked the unmerged yet. And there is a big problem with all of that: almost all of those commands require `HEAD`, which doesn't exist until the initial commit is done.
Overall, I think that we should stick with `git status`, which already does the aggregation we're about to do. I'm not sure that doing that ourselves will give better results.
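To illustrate the "one call to `git status --porcelain`" side of that comparison, here is a minimal sketch that derives all three counts from a single pass over its output. `count_porcelain` is a hypothetical helper, not code from `git-info`; it assumes the default (v1) porcelain format, where column 1 is the index state and column 2 the work tree state. Unmerged entries such as `UU` would be counted as both indexed and unindexed.

```sh
# count_porcelain: read `git status --porcelain` (v1) lines on stdin and
# print "<indexed> <unindexed> <untracked>". Hypothetical helper.
count_porcelain() {
  local line indexed=0 unindexed=0 untracked=0
  while IFS= read -r line; do
    case $line in
      '??'*) untracked=$((untracked + 1)); continue ;;  # untracked file
      '!!'*) continue ;;                                # ignored file
    esac
    # Column 1 (X) is the index state, column 2 (Y) the work tree state.
    case $line in
      ' '*) ;;                          # X blank: nothing staged
      *) indexed=$((indexed + 1)) ;;
    esac
    case $line in
      ?' '*) ;;                         # Y blank: no unstaged change
      *) unindexed=$((unindexed + 1)) ;;
    esac
  done
  echo "$indexed $unindexed $untracked"
}
```

It would be fed with `git status --porcelain | count_porcelain`, which also works before the initial commit because it does not reference `HEAD`.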
Has anybody bothered to test these changes for speed and bugginess?
I am inviting @skpw into this conversation.
I have made `git-info` faster by only computing information when a particular `zstyle` is defined. However, since `git-status` is slow and many do not want as much repository information as my theme shows, I have also added a mode, simple, in lieu of complex (feel free to suggest better names), that behaves similarly to `vcs_info`: it informs of staged and unstaged files. For the purpose of `git-info`, these shall be known as indexed files and unindexed files, because the `%S` format code is already in use for stashed files.
Select the mode you want for your theme:
zstyle ':prezto:module:git:info' status 'simple/complex'
I have come up with two versions of the simple mode, known as v1 and v2, which I shall discuss next.
v1 behaves similarly to `vcs_info`, but unlike `vcs_info`, unindexed also informs of untracked files because I have noticed that many `vcs_info` themes hack support for untracked files using a `vcs_info` hook, since most people consider both unindexed and untracked as one and the same: not in the index. See the peepcode theme for an example. They can be separated, of course; I just chose to follow the hook hack.
The performance of `vcs_info` and `git-info` is virtually identical provided that the `vcs_info` theme also checks for untracked files.
The following format codes are available.
Name | Format Code | Description |
---|---|---|
indexed | %i | Indexed files indicator |
unindexed | %I | Unindexed (including untracked) files indicator |
The deficiency of this version of the simple mode is that these format codes have to be set to a coloured UTF-8 character or word. There is no count of indexed and unindexed files like in other contexts.
v2 behaves similarly to the classic `git-info` and calls the same `git` porcelain commands as v1 but presents the computed information differently. unindexed no longer mashes together unindexed files and untracked files; they are now split into separate unindexed and untracked contexts. Furthermore, the file count for each context is provided.
This version also transplants two contexts from the complex mode, clean and dirty. Many people just want to know when a repository is dirty by displaying the ✗ character.
So, what is dirty?
dirty = indexed + unindexed + untracked
The above three contexts are initialised to 0 and unless defined in the theme, they are never computed. If dirty to you means unindexed and untracked but not indexed, and you want to show the ✗ character you'll have to define the following:
zstyle ':prezto:module:git:info:unindexed' format ' '
zstyle ':prezto:module:git:info:untracked' format ' '
zstyle ':prezto:module:git:info:dirty' format ' %F{red}✗%f'
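The arithmetic behind that rule can be sketched as follows. `dirty_total` is a hypothetical helper used only for illustration, not a function in `git-info`; an empty argument stands for a context the theme did not define, which therefore stays at 0.

```sh
# dirty_total INDEXED UNINDEXED UNTRACKED
# Hypothetical helper: an empty argument stands for a context the theme
# did not define, so it contributes 0 to the total.
dirty_total() {
  local indexed=${1:-0} unindexed=${2:-0} untracked=${3:-0}
  echo $((indexed + unindexed + untracked))
}

dirty_total 1 3 1    # all three contexts defined
dirty_total '' 3 1   # indexed not defined by the theme
```

With only unindexed and untracked defined, as in the zstyle lines above, staged files alone would not make the repository count as dirty.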
The following format codes are available.
Name | Format Code | Description |
---|---|---|
clean | %C | Clean state |
dirty | %D | Dirty files count |
indexed | %i | Indexed files count |
unindexed | %I | Unindexed files count |
untracked | %u | Untracked files count |
v2 is slightly slower than v1 because for indexed and unindexed, we can no longer rely on exit codes and have to count files.
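The difference between relying on an exit code and counting files can be sketched for the indexed context like this. Both helpers are hypothetical and are not taken from `git-info`; they assume the current directory is inside a repository that already has at least one commit, since `git diff-index` needs `HEAD`.

```sh
# Both helpers are hypothetical and assume the current directory is inside
# a repository that already has at least one commit (HEAD must exist).

# Indicator only: the exit status of --quiet is 1 when the index differs
# from HEAD, so nothing has to be listed or counted.
has_staged() {
  local rc=0
  git diff-index --cached --quiet HEAD -- 2>/dev/null || rc=$?
  [ "$rc" -eq 1 ]
}

# Count: every staged path has to be printed and the lines counted.
count_staged() {
  git diff-index --cached --name-only HEAD -- 2>/dev/null | wc -l | tr -d ' '
}
```

The first form lets git stop at the first difference, while the second has to enumerate every changed path, which is where the extra cost would come from.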
Using `time (vcs_info)` and `time (git-info)`, I have got the following numbers in a repository with 1 indexed file, 3 unindexed files, and 1 untracked file.
Please vote for or against v1 or v2. You can also suggest your own or none at all. I'm not particularly fond of adding more features to `git-info`.
:+1: v2
:+1: v2
Perhaps minimal and verbose are better names for the two modes than simple and complex.
If anybody has got ideas on how to speed it up further, I'm listening. Yes, you'll have to read and comprehend the giant `git-info` function.
If all you want to show is a dirty repository indicator, no counts, `vcs_info` is still your best bet.
So, `git-info` is currently about 408 lines long, but many themes don't need all the stuff it does. Actually, I don't mind everything being there, but the reason I created this issue is its speed. It's freaking slow.
How about adding light-git-info that will only do `git symbolic-ref HEAD 2> /dev/null` (get current branch)?
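A minimal sketch of what such a light function might look like, assuming it only needs the branch name. The name `light_git_info` and its behaviour on a detached `HEAD` (print nothing) are assumptions for illustration, not an existing part of the project.

```sh
# light_git_info: print the current branch name, or nothing when HEAD is
# detached or the directory is not inside a repository. Hypothetical name.
light_git_info() {
  local ref
  # -q suppresses the error for a detached HEAD; stderr is discarded for
  # the "not a git repository" case.
  ref=$(git symbolic-ref -q HEAD 2>/dev/null) || return 0
  # Strip the refs/heads/ prefix to leave just the branch name.
  echo "${ref#refs/heads/}"
}
```

Because it runs a single git command that only reads `HEAD`, it never walks the work tree, which is what makes `git status` expensive.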