FredrikNoren / ungit

The easiest way to use git. On any platform. Anywhere.
MIT License
10.44k stars 637 forks source link

Too slow for big repository. #1091

Open lygstate opened 6 years ago

lygstate commented 6 years ago

I suggest use in-memory database for log history:_)

jung-kim commented 6 years ago

yes, ungit does struggle when it is rendering large repos with many branches and commits.

I'm sure in memory caching would help but I would like to do performance check on some of the queries and it's parsing as well.

lygstate commented 6 years ago

may optimize by webasdembly

martinmcclure commented 4 years ago

My company is running into this problem as well, and we may not be able to use Ungit for our main product because of long waits. Speed is OK initially, but scrolling down to the point that Ungit starts to display more history gets very slow (tens of seconds the first scroll, more for subsequent scrolls). Profiling shows that almost all of the time is being taken in _registerListener in the JS-Signals code. At least that's what I think it's telling me -- I haven't got much experience with Javascript. See the attached screen shot. It looks like JS-Signals uses a sorted linear array to store listeners, which could get pretty slow if there are a lot of listeners for the same signal. Any ideas where I should look next? Thanks!

UngitPerformanceScreenshot

martinmcclure commented 4 years ago

I did a little more digging into the performance problem we're seeing. On first opening Ungit on our repository, it registers about 2300 listeners. After a quick scroll to the bottom and back up, it registers a bit over 50000 more listeners. This is rather slow. This is using js-signals, which has O(n squared) or worse performance on listeners added. By hacking the js-signals code to violate sorting by priority and order of addition, and to not check for duplicate listeners, I got Ungit's adding of more visible commits to be an order of magnitude faster.

I didn't look at why there are that many listeners, and whether they are necessary. If not all of those listeners are necessary, reducing the number would also help a lot.

jung-kim commented 4 years ago

Hi, began some testings as this is something that cropped on my end as well.

My current testing approach is running ungit against git's git repo and began debugging. Could you possibly confirm that latency impact you see is about similar with the git's git?

Right now, it is my goal to get ungit functional with git's git repo but if I should aim higher or different problems I need to focus, please let me know.

cheers

jung-kim commented 4 years ago

(my current suspicion is too many obervable objects being created with knockout js and I should trim some there...)

jung-kim commented 4 years ago

Did create ~#1091~#1275 and it should have significant speed up the performance.

I've tests with git's own git repo but @martinmcclure could you possibly try this with your repo and let me know if this is sufficient?

Also, it would help to disable animation by setting isAnimated: false within ~/.ungitrc per config as it would disable animation.

Hirse commented 4 years ago

@jung-kim Do you mean #1275?

jung-kim commented 4 years ago

Yup, #1275 is what i meant... thanks

martinmcclure commented 4 years ago

@jung-kim Thanks for taking a look at this issue.

My current testing approach is running ungit against git's git repo and began debugging. Could you possibly confirm that latency impact you see is about similar with the git's git?

I'm afraid that testing against the git repo is not very representative of our situation. A checkout of git contains about 3,900 files and directories, whereas ours is over 55,000 files and directories. A better test might be the linux kernel repo. It's a bit larger than ours, with 71,000 files and directories in a checkout.

Did create ~#1091~#1275 and it should have significant speed up the performance.

I've tests with git's own git repo but @martinmcclure could you possibly try this with your repo and let me know if this is sufficient?

It does show significant improvement, but not really enough to make it usable.

Before: Open the repository in Ungit, wait until CPU idle: about 3 1/2 minutes. Scroll to bottom, wait until CPU idle: about 11 minutes.

Using the 1091 branch from here: Open the repository in Ungit, wait until CPU idle: about 2:10 Scroll to bottom, wait until CPU idle: 2:54

Also, it would help to disable animation by setting isAnimated: false within ~/.ungitrc per config as it would disable animation.

Disabling animation made no noticeable difference in the timing.

With my signals hack (attached, renamed as .txt) things start to get usable: Open the repository in Ungit, wait until CPU idle: 0:04 Scroll to bottom, wait until CPU idle: 0:09 Not that I think my signals hack is necessarily a viable solution, but it's a clue (and gives me something to work with for now). signals.js.txt

jung-kim commented 4 years ago

Oh wow, this is definitely going to be a challenge and will began targeting "functional" linux kernel repo.

Glad to hear that #1091 does help and will work on finalizing those.

js-signals is fascinating and I would definitely have to read up more but I don't think anybody can argue against fantastic speed boost.
I will definitely read up more and try to test more and thank you for taking the initiative to help

campersau commented 4 years ago

Diff of the changes to signals.js made by @martinmcclure https://github.com/FredrikNoren/ungit/issues/1091#issuecomment-582089571

diff --git a/signals.js b/signals.js
index d9c6ff4..2056287 100644
--- a/signals.js
+++ b/signals.js
@@ -228,8 +228,12 @@
          */
         _registerListener : function (listener, isOnce, listenerContext, priority) {

-            var prevIndex = this._indexOfListener(listener, listenerContext),
-                binding;
+            //console.log("Registering a listener");
+            
+            var prevIndex = -1; //Hack to speed up by allowing duplicates
+            
+           // var prevIndex = this._indexOfListener(listener, listenerContext),
+           //     binding;

             if (prevIndex !== -1) {
                 binding = this._bindings[prevIndex];
@@ -253,10 +257,14 @@
          * @private
          */
         _addBinding : function (binding) {
-            //simplified insertion sort
+            //hack unsorted version; ignoring priorities
             var n = this._bindings.length;
-            do { --n; } while (this._bindings[n] && binding._priority <= this._bindings[n]._priority);
             this._bindings.splice(n + 1, 0, binding);
+            
+            //simplified insertion sort
+            //var n = this._bindings.length;
+            //do { --n; } while (this._bindings[n] && binding._priority <= this._bindings[n]._priority);
+            //this._bindings.splice(n + 1, 0, binding);
         },

         /**

Here is an other way to reduce the listeners by using event delegation: https://knockoutjs.com/documentation/unobtrusive-event-handling.html But that would require a ton of changes.

jung-kim commented 4 years ago

@campersau I think it might be easiest to create separate PR and test.

Would you like to spear head this one?

martinmcclure commented 4 years ago

I've created PR #1279 to make Ungit use a new fork of Signals that relaxes semantics that Ungit appears to not need, but is much faster. This is a more polished version of what I've referred to earlier as "my signals hack."

campersau commented 4 years ago

I have looked at the issue with signals in more depth now and here is what's currently going on in the code: In general we subscribe only in a few places to programEvents which is a singals instance like here https://github.com/FredrikNoren/ungit/blob/5c01beb7a229f816eb0abcbdabb9cab8e7ee2e6a/public/source/main.js#L151-L167 and in there we call onProgramEvent on the child components like here https://github.com/FredrikNoren/ungit/blob/5c01beb7a229f816eb0abcbdabb9cab8e7ee2e6a/components/app/app.js#L79-L87 which limits the listener count to just a few.

Also since none of our ungit components gets disposed (like normal ko components would) the components would never have a chance to unsubscribe from programEvents (or ko.computed).

Currently there is one exception to this, the textdiff component. Every instance of it subscribes to programEvents to update the displayed diffs according to the different diff settings like ignoring whitespace and in repositories with a lot of changed files this can result in a lot of listeners. https://github.com/FredrikNoren/ungit/blob/5c01beb7a229f816eb0abcbdabb9cab8e7ee2e6a/components/textdiff/textdiff.js#L82-L87

The solution proposed in #1279 is making programEvents.add faster but the listeners are still pilling up and never get removed which might lead to memory problems. Which I did get in one of my tests with the linux kernel repro: image

Another solution might be plumbing the onProgramEvent, just like in the other components, to the textdiff which would look like this: https://github.com/FredrikNoren/ungit/compare/master...campersau:onProgramEvent

It is a lot of stupid code though and maybe changing the invalidating diff logic completely like in https://github.com/FredrikNoren/ungit/compare/master...campersau:textdiffsettings might be better, this will add ko listeners but these can be garbage collected since they are not global like programEvents are.

It would be great if others can do some tests with the different proposals and share their results and opinions.

jung-kim commented 4 years ago

It is a lot of stupid code though and maybe changing the invalidating diff logic completely like in master...campersau:textdiffsettings might be better, this will add ko listeners but these can be garbage collected since they are not global like programEvents are.

I'm 100% guilty of this silly code. You are right, in retrospec, it is very silly to '.add()` here. I think we can easily fix this to watch an knockout variable.

Am I correct in saying that signal usages within textdiff is the only current issue with https://github.com/FredrikNoren/ungit/compare/master...campersau:micro-signals ?

I will do my own testing soon.


Out side of signals, doing proper disposals on knockouts variables has been on my todos but sheer volume of the variables make this task daunting.

I do believe doing proper ko variable disposals should be done in separate ticket as I don't think it would have big impact on speed but more for longevity of the process.

martinmcclure commented 4 years ago

@campersau Thanks for the analysis and possible solutions. I've lightly tested your textdiffsettings branch, and it looks better so far (it does not seem to build up memory like my pr #1279 does). Will be following the discussion and testing more as I get time.

melroy89 commented 2 years ago

It's very slow even in small repos. Too bad since I do like ungit, but it's unusable for me.