gitless-vcs / gitless

A simple version control system built on top of Git
https://gitless.com
MIT License
1.92k stars 107 forks

gl switch taking forever #134

Open BarrySmith opened 7 years ago

BarrySmith commented 7 years ago

I downloaded a fairly large, but not absurdly large, repository, bitbucket.com/petsc/petsc, then tried to switch to the next branch with

gl switch next

It has been hanging for several minutes, with the activity manager showing 70+% of CPU in git. This is on a Mac with the latest OS and gitless installed with brew.

Is there a known scalability issue with gl switch for large repositories, and a plan to fix it? Currently this makes gl worthless for me.

chmike commented 7 years ago

Sorry, I can't help. I suspect it depends on how many files there are in the repository. It shouldn't depend on the size of the history (number of commits).

If switching branches with git is much faster than with gitless, the cause should be in the gitless-specific handling. One thing that gitless does and git doesn't is to also save untracked files in the stash. Maybe it's this operation that is inefficient.

One way to test it, I guess, is to use the --move-over (-mo) option with the gl switch command. It tells gitless to not stash untracked and modified files. This way you find them in your workspace with the branch set to the target branch. I would test this on a copy of the directory to avoid corrupting your workspace.

Since it would move modified files, it doesn't work exactly like a git checkout that would override the modified files, or block and warn you about that. So speed depends on how this switch is implemented. This simple test with -mo will give us more info where the problem could be.
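To make the comparison suggested above concrete, here is a minimal timing sketch. The helper name time_command is hypothetical, and it only measures wall-clock time for whatever command you pass it; it assumes you run it against a copy of the repository, as recommended above.

```python
import subprocess
import sys
import time


def time_command(args, cwd=None):
    """Run a command and return (wall-clock seconds, exit code).

    Hypothetical helper for illustration; not part of gitless.
    """
    start = time.perf_counter()
    result = subprocess.run(args, cwd=cwd, capture_output=True, text=True)
    return time.perf_counter() - start, result.returncode


# Sanity check with a command that exists everywhere Python does.
elapsed, code = time_command([sys.executable, "-c", "pass"])
print(f"python -c pass: {elapsed:.2f}s (exit code {code})")
```

On a throwaway copy of the repository, one could then call time_command(["gl", "switch", "next"]) and time_command(["gl", "switch", "-mo", "next"]) and compare the two durations.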

gitless is implemented in Python. This is perfect for a proof of concept. But I don't think it's good for big projects.

spderosso commented 7 years ago

You can use the -mo flag to move over changes to the destination branch as a temporary workaround. That won't give you independent branching but at least it won't hang forever. The problem is that internally we use git stash to save all uncommitted changes of a branch, and that takes a lot of time when there are lots of files. More info here: #20

We are looking into doing this differently but I don't know when I'll get the chance to work on it (but hopefully sometime soon!)

pakdev commented 7 years ago

Does git stash account for the slow down when both branches don't have any uncommitted files? Here's what I see when I execute the following sequence of commands:

Command                  Approx. Execution Time (s)
git switch dev           25
git switch master -mo    25
git switch dev -mo       1
git switch master -mo    1

I assume the switch back to master with the --move-over flag takes just as long because it has to unstash?

erikness commented 6 years ago

gl switch is hanging for me and I bet it's because it tries to stash all the ignored files as well. When gitless stashes changes on a gl switch, it uses the --all flag which stashes everything: tracked, untracked, and ignored.

https://github.com/sdg-mit/gitless/blob/6a785318f47e9e054bfe51d32441fae7671124da/gitless/core.py#L331
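As a rough way to check how much an "--all" stash would have to pick up in a given repository, the sketch below counts entries by category from the output of git status --porcelain --ignored. The function names are hypothetical and this is not gitless code. Note that git collapses a fully ignored directory into a single entry by default, so the ignored count is a lower bound on the number of files.

```python
import subprocess
from collections import Counter


def classify_status(porcelain_text):
    """Count entries in `git status --porcelain --ignored` output by category."""
    counts = Counter()
    for line in porcelain_text.splitlines():
        if not line:
            continue
        code = line[:2]  # two-character status code at the start of each line
        if code == "!!":
            counts["ignored"] += 1
        elif code == "??":
            counts["untracked"] += 1
        else:
            counts["tracked_changes"] += 1
    return dict(counts)


def repo_status_counts(repo_dir="."):
    """Run git in repo_dir and classify its status output (requires git on PATH)."""
    out = subprocess.run(
        ["git", "status", "--porcelain", "--ignored"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    return classify_status(out)


# Example with made-up status output:
sample = " M src/a.c\n?? notes.txt\n!! build/\n!! node_modules/\n"
print(classify_status(sample))
```

A large "ignored" count relative to the other two categories would be consistent with the suspicion that stashing ignored files dominates the switch time.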

Testing

I have 3 branches - master, dev, and dev-copy (which is identical to dev). This is a large repository with a lot of ignored build files, but none of the branches have any tracked or untracked files from gitless's perspective.

After cleaning everything out with git clean -xdf:

Command                          Wall Clock Execution Time (s)
(from master) gl switch dev      8
gl switch dev-copy               7
gl switch master                 9

Not as quick as git checkout but manageable (not sure why mine is noticeably faster than @pakdev's). These commands wouldn't finish at all without running git clean.

Here's what I get when my repository is not cleaned but I include the --move-over flag (again, no tracked or untracked changes, just the ignored build files), which skips the stash step entirely:

Command                                      Wall Clock Execution Time (s)
(from master) gl switch --move-over dev      5
gl switch --move-over dev-copy               5
gl switch --move-over master                 5

Since I had no tracked or untracked changes when running this, the reason that these took 5 seconds instead of forever had to be because gitless was stashing ignored files.

So why does gitless need to stash the ignored files on a gl switch, instead of just the tracked and untracked files? git stash has an alternative flag, --include-untracked, which stashes (from git's perspective) changes to tracked files (both staged and unstaged) and untracked files, but does not stash ignored files.
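The difference between the two flags can be sketched as a small command builder. This is only an illustration: stash_args is a hypothetical name, it does not reproduce how gitless actually invokes git, and it assumes a git version that supports git stash push.

```python
def stash_args(include_ignored=False, message=None):
    """Build a `git stash push` command line (hypothetical helper).

    include_ignored=True  -> --all (tracked, untracked, and ignored files)
    include_ignored=False -> --include-untracked (tracked and untracked only)
    """
    args = ["git", "stash", "push"]
    args.append("--all" if include_ignored else "--include-untracked")
    if message is not None:
        args += ["--message", message]
    return args


print(stash_args(include_ignored=True))
print(stash_args(include_ignored=False, message="before switching branch"))
```

Passing the resulting list to subprocess.run inside a repository would perform the stash; with include_ignored=False, ignored build output stays in the working directory untouched.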

zph commented 5 years ago

I'm interested in trying out gitless as the tool to handle modified or added files when switching branches. But when trying it in a repository with many compiled/downloaded assets that are git ignored, it's too slow :(.

I followed the recommendation from @erikness and packaged that up as a Homebrew formula that patches this project at install time to use --include-untracked instead of --all.

It can be installed via:

brew install https://gist.githubusercontent.com/zph/7d3953208d60c267ce4f96cfda0e7efb/raw/20dcf6901908fd1d2da2858be418ce7b29fa0fa3/gitless_with_untracked.rb

It improves the branch swap time from 30 s (at which point it failed with git carriage-return line errors on node_module) to 1 s.