pref: ffi performance improvements

Conni2461 commented 3 years ago

fzy comparison:

kkharji commented 3 years ago

😍😍😍😍😍😍

vheon commented 3 years ago

@Conni2461 how did you get the performance data posted in this PR? I wanted to try to tweak something and it would be useful.

Conni2461 commented 3 years ago

First of all if you want to only test the c code you can do make benchmark. It uses libcurl to download nlohmann/json.hpp and does multiple score calculations over it.

The images here are based on a slightly modified benchmark script that can be found in fzy-native: https://github.com/romgrk/fzy-lua-native/blob/master/benchmark.lua So i can compare it easily.

local fzf = require("fzf_lib")

local function lines_from(file)
  local lines = {}
  local i = 1
  for line in io.lines(file) do
    lines[i] = line
    i = i + 1
  end
  return lines
end

local function benchmark(fn)
  local total = 0
  -- Warmup
  for _ = 1, 5 do
    total = total + #fn()
  end

  local start = os.clock()
  total = 0
  for _ = 1, 1 do
    total = total + #fn()
  end
  local end_ = os.clock()
  return (end_ - start) * 1000, total
end

local function filter(prompt, lines)
  local results = {}

  local slab = fzf.allocate_slab()
  local p = fzf.parse_pattern(prompt, 0)
  for _, line in ipairs(lines) do
    local score = fzf.get_score(line, p, slab)
    if score > 0 then
      local positions = fzf.get_pos(line, p, slab)
      table.insert(results, { line, positions, score })
    end
  end
  fzf.free_pattern(p)
  fzf.free_slab(slab)
  return results
end

local lines = lines_from('./files')

print('fzf-native lua performance:')
print('Lines: ' .. #lines)
print('')

local time, total = benchmark(function() return filter('f', lines) end)
print('Native:')
print(' -> Total: ' .. total)
print(' -> Time:  ' .. time .. ' ms')

and i did this in home dir

fd -I --hidden > files

Note: This does not call the telescope sorter interface, it calls the bindings below (which are used in the telescope sorter interface) Note 2: The script also gets the positions for all files that have a score > 0. This would not happen in telescope. There we only get the positions for ~50 files. So if you remove the fzf.get_pos call it takes about 40ms to get scores for all 240000 files.

If you can tweak stuff that would be super cool. I am not super familiar with c so every improvement is very welcome ;) And feel free to ask more questions :)

vheon commented 3 years ago

I am not super familiar with c

Me neither I mainly work in C++ so a lot of the constructs I'm used to are not applicable 😜

And feel free to ask more questions :)

You bet! 👍 just a quick one I'm seeing the v1 and v2 implementation but it seems only the v2 is used... why keep the v1 around? I'm not sure if this PR is the right place to ask questions 😝

Conni2461 commented 3 years ago

You can also open a question issue or so. I don't really care that much about it :rofl: Or gitter private message

You bet! just a quick one I'm seeing the v1 and v2 implementation but it seems only the v2 is used... why keep the v1 around? I'm not sure if this PR is the right place to ask questions

Because fzf also has both :rofl: Real answer: because v1 is faster in certain situations. Basically if we have a slab (preallocated block of memory) and if M * N > this block its slower to allocate M * N than using the v1 version.

https://github.com/nvim-telescope/telescope-fzf-native.nvim/blob/64efff7b8b552fddfb6e6e66f78bed0159835cf1/src/fzf.c#L520-L523

If we don't have a slab we already accept that we are allocating memory so in that case we just continue to use v2. Thats how junegunn handles it as well.

nvim-telescope / telescope-fzf-native.nvim

pref: ffi performance improvements #19