Performance degradation: tens of thousands of rows become unusable

wickedest / Mergely

Merge and diff documents online

http://www.mergely.com

Performance degradation: tens of thousands of rows become unusable #140

Closed Boom1597 closed 1 year ago

Boom1597 commented 3 years ago

I ran into the same problem as issue #83 when comparing more than 40,000 rows of data. Is this a limitation of Mergely, or are there other ways to solve it?

wickedest commented 3 years ago

@Boom1597, performance can vary depending on the number of lines, the number of changes within the lines, and even the CPU and memory available to the browser. There are a few ways to improve performance, but each degrades the experience a bit. You can check out the docs and search for "performance", but the options are: `viewport`, `sidebar`, and `lcs`. If setting those options still does not give satisfactory performance, then maybe post an example.
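As a rough illustration of the three options named above, here is a minimal sketch of an options object. The interface name `MergelyPerfOptions`, the constant `perfOptions`, and the chosen values are assumptions for illustration only; the defaults and the way the object is passed to Mergely depend on the version, so check the docs before relying on it.

```typescript
// Hypothetical type covering only the three options mentioned in this thread.
// It is not a type exported by Mergely.
interface MergelyPerfOptions {
  viewport: boolean; // assumed: limit rendering work to the visible region
  sidebar: boolean;  // assumed: draw change markers in the sidebar
  lcs: boolean;      // assumed: compute the finer within-line changes
}

// Assumed "favor speed" settings; each one trades away some of the experience.
const perfOptions: MergelyPerfOptions = {
  viewport: true,
  sidebar: false,
  lcs: false,
};

// The object would then be supplied as part of the options given to Mergely
// when the editor is created (the exact call is version-specific).
console.log(perfOptions);
```

These settings reduce rendering and in-line work only; they do not change the cost of the line-level diff itself.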

Boom1597 commented 3 years ago

@wickedest, thanks for answering. I tried the options you mentioned, but they had no effect. Here is an example: https://codepen.io/boom1597/pen/gOLOMyK or you can use my file: https://gist.githubusercontent.com/Boom1597/d19a5cb91d11ae5f7c69bbaa5344bad0/raw/0e548bd3a028fc9003cd34ae6c33f97e3b5cc580/gistfile1.txt What puzzles me is that the text appears in the interface before the diff result has finished loading. That will confuse my users, who will think it is a bug. Thanks again for your answer.

wickedest commented 3 years ago

@Boom1597, thanks for taking the time to create the gist and the codepen. I can reproduce your issue. Tweaking those options had no effect because most of the time is spent in the diff algorithm itself. The input file is 1.1 MB and is being compared against a file that has nothing in common with it, so the algorithm is doing the best it can to find the longest common subsequence (LCS). I accept that it's not performing well in this instance, especially when GNU diff takes a fraction of a second. It's been a long time (9 years) since I coded the algorithm. I think there are some performance improvements that can be made, but as you can imagine it's fairly technical and difficult to optimize. I'll keep it on my TODO list to research further and find ways to improve the performance.
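One preprocessing idea for the "no commonality" case is sketched below: a line that never occurs in the other file cannot be part of any common subsequence, so it can be set aside before the LCS search runs. As far as I know GNU diff applies a similar (more elaborate) filtering step, but this is not its code and not Mergely's; the names `Trimmed` and `dropUnmatchedLines` are hypothetical.

```typescript
// Indices of the lines that survive filtering on each side.
interface Trimmed {
  keptA: number[]; // indices into `a` of lines that also occur somewhere in `b`
  keptB: number[]; // indices into `b` of lines that also occur somewhere in `a`
}

// Set aside lines that exist on only one side. Those lines are guaranteed to be
// pure deletions (from `a`) or pure insertions (from `b`), so the LCS search
// only needs to look at the indices that are kept.
function dropUnmatchedLines(a: string[], b: string[]): Trimmed {
  const inA = new Set(a);
  const inB = new Set(b);
  const keptA: number[] = [];
  const keptB: number[] = [];
  a.forEach((line, i) => {
    if (inB.has(line)) keptA.push(i);
  });
  b.forEach((line, j) => {
    if (inA.has(line)) keptB.push(j);
  });
  return { keptA, keptB };
}

// Two inputs with nothing in common leave nothing for the LCS search to do.
const disjoint = dropUnmatchedLines(["a", "b"], ["c", "d"]);
console.log(disjoint.keptA.length, disjoint.keptB.length);
```

The filter costs one pass over each file plus set lookups, which is small next to a full LCS search over tens of thousands of unmatched lines.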

Boom1597 commented 3 years ago

Thanks for your attention.

kmanikandanmca2008 commented 2 years ago

Is there any update on the performance optimisation front? I am also looking for a fix.

wickedest commented 2 years ago

@kmanikandanmca2008, I started looking into it a few weeks ago. The algorithm I'm using for shortest middle snake is recursive and doesn't perform well. I'm researching alternatives.
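For readers unfamiliar with the terminology, the sketch below shows a loop-based forward pass of Myers' greedy O(ND) diff algorithm, which is where the "snake" vocabulary comes from. It is not Mergely's implementation and not the change that shipped in 5.0.0; the function name `editDistance` is hypothetical, and it only returns the number of inserted plus deleted lines rather than the edits themselves.

```typescript
// Length of the shortest edit script (insertions + deletions) between two
// line arrays, computed iteratively with Myers' greedy algorithm.
function editDistance(a: string[], b: string[]): number {
  const n = a.length;
  const m = b.length;
  const max = n + m;
  // v[k + max] holds the furthest x reached so far on diagonal k = x - y.
  const v = new Int32Array(2 * max + 2);
  for (let d = 0; d <= max; d++) {
    for (let k = -d; k <= d; k += 2) {
      let x: number;
      if (k === -d || (k !== d && v[k - 1 + max] < v[k + 1 + max])) {
        x = v[k + 1 + max]; // move down: take a line from b (insertion)
      } else {
        x = v[k - 1 + max] + 1; // move right: drop a line from a (deletion)
      }
      let y = x - k;
      // Follow the snake: consume matching lines along the diagonal for free.
      while (x < n && y < m && a[x] === b[y]) {
        x++;
        y++;
      }
      v[k + max] = x;
      if (x >= n && y >= m) return d; // reached the end of both inputs
    }
  }
  return max; // not reached: the loop always returns by d = n + m
}

console.log(editDistance(["a", "b", "c"], ["a", "c"])); // 1
```

The outer loop runs once per edit, so the work grows with the number of differences. Two files with no lines in common are the worst case, because every line on both sides is an edit.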

github-actions[bot] commented 1 year ago

:tada: This issue has been resolved in version 5.0.0-alpha.1 :tada:

Your semantic-release bot :package::rocket:

github-actions[bot] commented 1 year ago

:tada: This issue has been resolved in version 5.0.0 :tada:

Your semantic-release bot :package::rocket: