Misleading benchmark in README

jalil-salame commented 8 months ago

This benchmark image is misleading:

README benchmark image

The patterns used are not the same:

fzf-native uses main.c$
fzy-native uses main.c
fzf-lua uses main.c

Notice the extra $ at the end in fzf-native. This causes fzf-native to only find 39 matches compared to the 44215. It is not clear that the fzf-native search is more specific and thus gives misleading information about the benchmark results.

Proposed solution

Add a note like:

[!Note] Here fzf-native is much faster because we can use a narrower match thanks to fzf-native's syntax:

README benchmark image

Other solutions that were considered

Re-run the benchmark using main.c for all the queries
Re-run the benchmark using main.c$ for all queries (I couldn't find out if fzy-native supports regex matches though)
Delete the benchmark from the repo (there is already one benchmark, no need for more c:)

Conni2461 commented 8 months ago

I couldn't find out if fzy-native supports regex matches though

there is no way to do this with fzy, thats kinda the point if you need to find things main.c doesnt get you anywhere with fzy but with fzf you can just append a $ and you end up with 39 results in ms. Thats why i wanna keep that benchmark! But i will redo it.

i will be fair and show main.c for all queries but i keep the main.c$ with an additional note. And i will also add the benchmark code, because i have to redo it from scratch, i no longer have the input file with the 180k lines so i redo it with a 600k input file. or something like that.

i always knew that this test was misleading but yeah, it is what it is

jalil-salame commented 8 months ago

I couldn't find out if fzy-native supports regex matches though

there is no way to do this with fzy, thats kinda the point if you need to find things main.c doesnt get you anywhere with fzy but with fzf you can just append a $ and you end up with 39 results in ms. Thats why i wanna keep that benchmark! But i will redo it.

Then just add the note c: The $ is easy to miss in the wall of text, and it is hard to find out if fzy supports the syntax.

A better wording of the note would be:

[!Note] The following test uses fzf's syntax to get a narrower match, this makes us 11x faster than fzy :tada: by not doing as much work (we return 39 matches instead of 44215).

This way you can skip redoing the benchmarks.

On the other hand, being able to reproduce the benchmarks would be nice c: And if you do redo them, then I suggest comparing:

fzf-native main.c$ (narrower match, only supported by fzf-native :tada:)
fzf-native main.c (broader match)
fzy-native main.c
fzy-lua main.c

This keeps the benchmarks clear c:

Conni2461 commented 8 months ago

okay i need to write this down better for documentation but this is the script:

local path = require "plenary.path"
local bench = require "plenary.benchmark"
local fzf = require "fzf_lib"
local fzy_lua = require "telescope.algos.fzy"
local fzy_native = require "fzy-lua-native" # romgrk/fzy-lua-native

local files = vim.split(path:new("files"):read(), "\n")

print("Benchmark Setup: 5 warmup and 25 runs each")
bench("fuzzy sorting", {
  warmup = 5,
  runs = 25,
  fun = {
    {
      "fzf-native pattern: 'f' results: '414451'",
      function()
        local out = {}

        local slab = fzf.allocate_slab()
        local p = fzf.parse_pattern("f", 0)

        for _, f in ipairs(files) do
          local score = fzf.get_score(f, p, slab)
          if score > 0 then
            table.insert(out, { line = f, score = score })
          end
        end

        fzf.free_pattern(p)
        fzf.free_slab(slab)
        return #out
      end,
    },
    {
      "fzy-native pattern: 'f' results: '414451'",
      function()
        local out = {}

        local needle = "f"
        for _, f in ipairs(files) do
          if fzy_native.has_match(needle, f) then
            local _, score = fzy_native.positions(needle, f)
            table.insert(out, { line = f, score = score })
          end
        end

        return #out
      end,
    },
    {
      "fzy-lua pattern: 'f' results: '414451'",
      function()
        local out = {}

        local needle = "f"
        for _, f in ipairs(files) do
          if fzy_lua.has_match(needle, f) then
            local _, score = fzy_lua.positions(needle, f)
            table.insert(out, { line = f, score = score })
          end
        end

        return #out
      end,
    },
  },
})

bench("fuzzy sorting", {
  warmup = 5,
  runs = 25,
  fun = {
    {
      "fzf-native pattern: 'main.c' results: '61539'",
      function()
        local out = {}

        local slab = fzf.allocate_slab()
        local p = fzf.parse_pattern("main.c", 0)

        for _, f in ipairs(files) do
          local score = fzf.get_score(f, p, slab)
          if score > 0 then
            table.insert(out, { line = f, score = score })
          end
        end

        fzf.free_pattern(p)
        fzf.free_slab(slab)
        return #out
      end,
    },
    {
      "fzf-native pattern: 'main.c$' results: '66'",
      function()
        local out = {}

        local slab = fzf.allocate_slab()
        local p = fzf.parse_pattern("main.c$", 0)

        for _, f in ipairs(files) do
          local score = fzf.get_score(f, p, slab)
          if score > 0 then
            table.insert(out, { line = f, score = score })
          end
        end

        fzf.free_pattern(p)
        fzf.free_slab(slab)
        return #out
      end,
    },
    {
      "fzy-native pattern: 'main.c' results: '61539'",
      function()
        local out = {}

        local needle = "main.c"
        for _, f in ipairs(files) do
          if fzy_native.has_match(needle, f) then
            local _, score = fzy_native.positions(needle, f)
            table.insert(out, { line = f, score = score })
          end
        end
        return #out
      end,
    },
    {
      "fzy-lua pattern: 'main.c' results: '61539'",
      function()
        local out = {}

        local needle = "main.c"
        for _, f in ipairs(files) do
          if fzy_lua.has_match(needle, f) then
            local _, score = fzy_lua.positions(needle, f)
            table.insert(out, { line = f, score = score })
          end
        end
        return #out
      end,
    },
  },
})

and these are the results

Benchmark Setup: 5 warmup and 25 runs each
Benchmark Group: 'fuzzy sorting' -----------------------
Benchmark #1: 'fzf-native pattern: 'f' results: '414451''
  Time(mean ± σ):    118.8 ms ±  34.7 ms
  Range(min … max):   88.5 ms … 204.4 ms  25 runs
Benchmark #2: 'fzy-native pattern: 'f' results: '414451''
  Time(mean ± σ):    363.0 ms ±  31.5 ms
  Range(min … max):  280.0 ms … 407.7 ms  25 runs
Benchmark #3: 'fzy-lua pattern: 'f' results: '414451''
  Time(mean ± σ):    4.7 s ±  54.7 ms
  Range(min … max):  4.7 s … 4.9 s  25 runs
Summary
  'fzf-native pattern: 'f' results: '414451'' ran
  3.1 ± 0.9 times faster than 'fzy-native pattern: 'f' results: '414451''
  39.9 ± 11.7 times faster than 'fzy-lua pattern: 'f' results: '414451''

Benchmark Group: 'fuzzy sorting' -----------------------
Benchmark #1: 'fzf-native pattern: 'main.c' results: '61539''
  Time(mean ± σ):    102.5 ms ±  17.6 ms
  Range(min … max):   91.6 ms … 163.1 ms  25 runs
Benchmark #2: 'fzf-native pattern: 'main.c$' results: '66''
  Time(mean ± σ):      7.7 ms ± 272.1 μs
  Range(min … max):    7.4 ms …   8.1 ms  25 runs
Benchmark #3: 'fzy-native pattern: 'main.c' results: '61539''
  Time(mean ± σ):    108.1 ms ±  27.8 ms
  Range(min … max):   90.7 ms … 162.3 ms  25 runs
Benchmark #4: 'fzy-lua pattern: 'main.c' results: '61539''
  Time(mean ± σ):    1.7 s ±  27.5 ms
  Range(min … max):  1.7 s … 1.8 s  25 runs
Summary
  'fzf-native pattern: 'main.c$' results: '66'' ran
  13.4 ± 2.3 times faster than 'fzf-native pattern: 'main.c' results: '61539''
  14.1 ± 3.7 times faster than 'fzy-native pattern: 'main.c' results: '61539''
  227.7 ± 8.8 times faster than 'fzy-lua pattern: 'main.c' results: '61539''

nvim-telescope / telescope-fzf-native.nvim

Misleading benchmark in README #123