Wintabs functions called too many times.

jordwalke commented 6 years ago

I opened a file with 1000 lines, and moved the cursor downwards from top to bottom.

I am using Wintabs' statusline mode. I had two splits open, in one split I had two buffers, in the other split I had one buffer.

I also have airline rendering in the tabline so it shouldn't be touching the statusline. I activated airline to use as a comparison since it does update with every cursor movement and wintabs should not (I'm just moving the cursor through a single file one thousand lines long).

I would expect pretty much no work to be done while moving the cursor in wintabs, but the function:FUNCTION wintabs#element#slice() was called 25,000 times, and consumed 7 seconds of work (SELF time). FUNCTION wintabs#element#len() was called 200,000 times and consumed 4 seconds of time while scrolling from top to bottom in the file.

jordwalke commented 6 years ago

I notice that while moving the cursor on the first wintab within a window, wintabs#ui#get_statusline is called many times, whereas moving the cursor on the last wintab in a window, it's typically only called once.

jordwalke commented 6 years ago

I'm thinking that in this part of wintabs:

" set statusline window by window
function! wintabs#ui#set_statusline()
  for window in range(1, winnr('$'))
    call setwinvar(window, '&statusline', '%!wintabs#ui#get_statusline('.window.')')
  endfor
endfunction

It should either be the case that get_statusline is very very fast, it we should instead be preevaluating the statusline string, as opposed to using %!wintabs#ui#get_statusline('.window.')' which will cause it to call the get_statusline on every cursor movement.

zefei commented 6 years ago

Thanks for the analysis @jordwalke! This sounds pretty bad if you are using tabline mode and statusline function is called that many times. I never really dug into the statusline performance issue as I assumed it to be quite complex, but tabline shouldn't have this problem. I'll see what I can do.

jordwalke commented 6 years ago

I made a mistake in my description (now fixed). I'm not using tabline mode. I'm using statusline mode. The statusline is invoked at least once per cursor move, but then the perf of each invocation is very concerning (and sometimes it's called too many times it seems). Some caching would go a long ways.

zefei commented 6 years ago

Hey, I changed the expensive functions (len, render, slice, truncate) to be memoized. Please let me know if perf is better now.

BTW, if you are using tabline as statusline, do you refresh it on cursor move (for row/col numbers)? I used to do this and it's very slow as well.

jordwalke commented 6 years ago

The performance improvements seem to have made a positive difference, thanks! There's still room for improvement, but this is great.

BTW, if you are using tabline as statusline, do you refresh it on cursor move (for row/col numbers)? I used to do this and it's very slow as well.

Yes, I do. But I believe that even after the improvements, the time is dominated by the wintabs statusline computation. I think there's various ways to cause the tabline to rerender, and they may result in different performance profiles. I might be doing something that is good for perf (though I'm sure it could be improved). You can see my branch of airline that renders in the tabline. The end experience (plus wintab on the statusline) is the best I've seen in Vim.

jordwalke commented 6 years ago

If you're curious, here's a profile of scrolling through 900 lines for about twenty seconds.

"Total time" is the time that the function takes including all the functions it calls into (think: high level time it takes for that function to run). "Self time" is the time spent in that function alone (think: these are the "bottlenecks" that can be optimized).

FUNCTIONS SORTED ON TOTAL TIME
count  total (s)   self (s)  function
 2970   3.075502   0.144941  wintabs#ui#get_statusline()
 2970   2.537683   0.864738  <SNR>50_get_bufline()
 2970   0.826113   0.257395  wintabs#refresh_buflist()
 2970   0.418151   0.210715  wintabs#session#save()
11880   0.384570             wintabs#memoize#call()
 1471   0.348422   0.057259  GetSyntasticOk()
 1471   0.306205             <SNR>119_Highlight_Matching_Pair()
 2970   0.299891   0.042618  wintabs#renderers#buffer()
 1471   0.291163   0.023581  SyntasticStatuslineFlag()
 2970   0.257273   0.107341  wintabs#renderers#buf_label()
 2970   0.252677   0.076303  <SNR>50_truncate_line()
 5940   0.252491   0.088051  wintabs#element#len()
 5940   0.241610             wintabs#renderers#buffer_sep()
 1471   0.232496             airline#check_mode()
 2970   0.207436   0.148904  <SNR>47_session_save_window()
10297   0.203916             airline#util#append()
 1471   0.186286             311()
 1471   0.182887             airline#extensions#fugitiveline#bufname()
 8910   0.169836             wintabs#getwinvar()
 1471   0.150086             <SNR>141_wordcount_update()

FUNCTIONS SORTED ON SELF TIME
count  total (s)   self (s)  function
 2970   2.537683   0.864738  <SNR>50_get_bufline()
11880              0.384570  wintabs#memoize#call()
 1471              0.306205  <SNR>119_Highlight_Matching_Pair()
 2970   0.826113   0.257395  wintabs#refresh_buflist()
 5940              0.241610  wintabs#renderers#buffer_sep()
 1471              0.232496  airline#check_mode()
 2970   0.418151   0.210715  wintabs#session#save()
10297              0.203916  airline#util#append()
 1471              0.186286  311()
 1471              0.182887  airline#extensions#fugitiveline#bufname()
 8910              0.169836  wintabs#getwinvar()
 1471              0.150086  <SNR>141_wordcount_update()
 2970              0.149932  wintabs#renderers#bufname()
 2970   0.207436   0.148904  <SNR>47_session_save_window()
 2970   3.075502   0.144941  wintabs#ui#get_statusline()
 2970   0.257273   0.107341  wintabs#renderers#buf_label()
 2970              0.089653  <SNR>49_buflisted()
 5940   0.252491   0.088051  wintabs#element#len()
 1471              0.076849  GetEsyProjectStatus()
 2970   0.252677   0.076303  <SNR>50_truncate_line()

jordwalke commented 6 years ago

Here's the function get_bufline where most of the time is spent.

FUNCTION  <SNR>50_get_bufline()
Called 2970 times
Total time:   2.537683
 Self time:   0.864738

count total (s)  self (s)
2970  0.853428  0.027315  call wintabs#refresh_buflist(a:window)

2970            0.007517  let line = []
2970            0.008660  let active_start = 0
2970            0.007001  let active_end = 0
2970            0.005932  let i = 0
2970  0.081549  0.028709  let buffers = wintabs#getwinvar(a:window, 'wintabs_buflist', [])
2970            0.020255  let active_index = index(buffers, winbufnr(a:window))

5940            0.016526  for buffer in buffers
2970            0.012059    let is_active = i == active_index
2970            0.014054    let is_next_active = i == active_index - 1

2970            0.005582    if i == 0
2970            0.006303      if is_active
2970  0.122006  0.023820        let active_start = wintabs#element#len(line)
2970            0.004441      endif

2970  0.185495  0.059852      let element = g:wintabs_renderers.buffer_sep({'is_leftmost': 1,'is_rightmost': 0,'is_left': active_index >= 0,'is_right': 0,'is_active': is_active,})
2970            0.015431      let element.type = 'sep'
2970            0.014734      call add(line, element)
2970            0.003977    endif

2970  0.369921  0.070030    let element = g:wintabs_renderers.buffer(buffer, {'is_leftmost': 0,'is_rightmost': i == len(buffers) - 1,'is_left': active_index >= 0 && i < active_index,'is_right': active_index >= 0 && i > active_index,'is_active': is_active,})
2970            0.010868    let element.type = 'buffer'
2970            0.009923    let element.number = buffer
2970            0.013403    call add(line, element)

2970            0.006781    if is_next_active
                              let active_start = wintabs#element#len(line)
                            endif

2970  0.184617  0.068650    let element = g:wintabs_renderers.buffer_sep({'is_leftmost': 0,'is_rightmost': i == len(buffers) - 1,'is_left': active_index >= 0 && i < active_index,'is_right': active_index >= 0 && i >= active_index,'is_active': is_active || is_next_active,})
2970            0.009232    let element.type = 'sep'
2970            0.011607    call add(line, element)

2970            0.005879    if is_active
2970  0.177286  0.022981      let active_end = wintabs#element#len(line)
2970            0.004220    endif

2970            0.009582    let i = i + 1
2970            0.005051  endfor

2970            0.013634  return [line, active_start, active_end]

It looks like these two lines are the most expensive part, and they probably have a lot of overlapping computation

2970  0.369921  0.070030    let element = g:wintabs_renderers.buffer(buffer, {'is_leftmost': 0,'is_rightmost': i == len(buffers) - 1,'is_left': active_index >= 0 && i < active_index,'is_right': active_index >= 0 && i > active_index,'is_active': is_active,})
2970  0.184617  0.068650    let element = g:wintabs_renderers.buffer_sep(    {'is_leftmost': 0,'is_rightmost': i == len(buffers) - 1,'is_left': active_index >= 0 && i < active_index,'is_right': active_index >= 0 && i >= active_index,'is_active': is_active || is_next_active,})

zefei commented 6 years ago

Hey thanks for the analysis! I've profiled mine as well, and last time I stopped at get_bufline because it seems to be riddled with external states (wintabs#refresh_buflist and wintabs#renderers#bufname). Now I look at them again, there might be some low hanging fruit on buffer_sep, but others still seem harder to tackle.

When I get time, I'll certainly do more tuning and will let you know if I get some significant gains.

jordwalke commented 6 years ago

Cool, I think it's already a good improvement. Just thought I'd leave my results here in case someone gets bored and wants to make it even faster.

zefei commented 6 years ago

I took another shot at improving get_bufline and was able to find all the external state changes. Now this function is memoized in the last commit, I think perf has got a lot better.

jordwalke commented 6 years ago

fantastic

zefei / vim-wintabs

Wintabs functions called too many times. #25