odeke-em / vim

Automatically exported from code.google.com/p/vim
0 stars 0 forks source link

Expensive (non-limited) patterns with syntax/diff.vim #309

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
TEST CASE:
1. wget 
https://gist.githubusercontent.com/blueyed/b098ff662b2963621ec0/raw/a6f5a88f43c3
f55a48bf89ddcad08c86ed2e67c9/test.diff
2. vim -u NONE -N test.diff
3. :syntax on

There are a lot of expensive / non-limited regexp patterns in syntax/diff.vim,
mostly for the translations, e.g.:

    " he
    syn match diffOnly  "^.*-ב קר אצמנ .*"

With long diff lines, e.g. from a minified javascript or json file, this
becomes very expensive and Vim takes a very long time to apply the syntax
highlighting.

Patterns that start with `^.*` should probably get changed to not match what
the generic matches are meant to match (e.g. diffRemoved).
This might require to use syntax regions / contained or something similar?
(because all `syn match` statements are being applied (IIRC), and not just the
first/last one)

For starters, something like `^[^+-]` could be used, but then it would not
handle filenames starting with the characters.

The absolutely worst behaving pattern is the following, and might indicate a
problem with the regexp engine, which should optimize it according to the
anchoring at the end:

    " he
    syn match diffIsA   "^.* .*-ל .* .* תוושהל ןתינ אל$"

Apart from that, I think it might be a good idea to just use "LANGUAGE=C diff"
(or something similar to get the non-translated messages) to get the diff.

Original issue reported on code.google.com by dhahler@gmail.com on 6 Jan 2015 at 7:01

GoogleCodeExporter commented 9 years ago
> Apart from that, I think it might be a good idea to just use "LANGUAGE=C diff"
(or something similar to get the non-translated messages) to get the diff.

Scratch that.

Original comment by dhahler@gmail.com on 6 Jan 2015 at 7:18

GoogleCodeExporter commented 9 years ago
Try the latest runtime update 
(https://code.google.com/p/vim/source/detail?r=11d78e58a487471e13ecb5223b75249cd
7e949d5) I think, that provides a fix for matching translates diff files. (see 
:h diff.vim)

Original comment by chrisbr...@googlemail.com on 8 Jan 2015 at 11:23

GoogleCodeExporter commented 9 years ago
please confirm, that the runtime file update fixes the issue.

Original comment by chrisbr...@googlemail.com on 8 Jan 2015 at 11:23

GoogleCodeExporter commented 9 years ago
Thanks. It provides a good default, but does not fix the real issue, especially 
given that it's still enabled by default.

Could the regexp engine get improved to consider the anchoring at the end ("$")?

For reference: the patch is 
https://code.google.com/p/vim/source/diff?spec=svn11d78e58a487471e13ecb5223b7524
9cd7e949d5&r=11d78e58a487471e13ecb5223b75249cd7e949d5&format=side&path=/runtime/
syntax/diff.vim.

Original comment by dhahler@gmail.com on 9 Jan 2015 at 10:51

GoogleCodeExporter commented 9 years ago
I checked again. While I do not think, it is feasible to cache the win_line() 
function (that function is huge and one would have to cache many options as the 
line drawing depends on them and this will likely introduce regressions) I 
found that especially the hebrew localization makes Vim slow. Vim already skips 
back to the old engine automatically because of the slow regular expressions 
for the hebrew localization.

So here is a patch, documenting the g:diff_localization issue (which isn't yet, 
I think) and also applies the hebrew syntax highlighting only if some hebrew 
chars are actually found. Using that patch, this makes Vim at least react 
(though the other localization still make Vim very slow as seen by :syntime 
report but you disableing the localization entirely makes Vim then usable again

Original comment by chrisbr...@googlemail.com on 27 Jan 2015 at 7:06

Attachments:

GoogleCodeExporter commented 9 years ago
I have included Christian's change.  Please check once the runtime files have 
been updated.  I'll close this issue for now.

Original comment by brammool...@gmail.com on 3 Feb 2015 at 5:47