performance and design comparison to nucleo

matu3ba commented 1 year ago

See announcement https://www.reddit.com/r/rust/comments/165lp5a/announcing_nucleo_a_fast_fuzzy_matcher_library/ and repo https://github.com/helix-editor/nucleo.

jake-stewart commented 1 year ago

The announcement claims

runs the matcher in parallel on a background thread pool and doesn't block the UI thread
allows fully parallel lock-free injection/streaming of inputs into the matcher

these are both already implemented in jfind. i can benchmark it against jfind once a matcher is made using it.

jake-stewart commented 1 year ago

i did not realise it had already been implemented in helix. i'll try do a benchmark now

jake-stewart commented 1 year ago

unfortunately, helix does not use fd for finding files. it has it's own directory browser so i could not replace it like with previous benchmarks. i have compiled the latest helix from github and ran it on my /Applications directory as a very crude performance test.

https://asciinema.org/a/VFVdJ5hyN4wMa5sDlzeZBwtfL

jfind was a lot faster. however, this is a crude benchmark and it would be nice if a standalone fuzzy finder was made using nucleo so it can be tested properly.

jake-stewart commented 1 year ago

in response to helix-editor/nucleo#22, i do not see what is odd about including reading speed in a fuzzy finder benchmark. if your architecture results in a terrible performance due to a reading speed bottleneck, then that is a problem with your architecture.

for the end user, reading speed is a very big deal. they open their fuzzy finder and start searching. nobody sits and waits for the loading to complete. if this were the case, why is there such thing as streaming input?

pascalkuthe commented 1 year ago

in response to helix-editor/nucleo#22, i do not see what is odd about including reading speed in a fuzzy finder benchmark. if your architecture results in a terrible performance due to a reading speed bottleneck, then that is a problem with your architecture.

for the end user, reading speed is a very big deal. they open their fuzzy finder and start searching. nobody sits and waits for the loading to complete. if this were the case, why is there such thing as streaming input?

I found this by accident, if you are going to respond its kind of pointless to do it here since I don't get a notifications for that.

Helix sorts all files alphabetically, which requires a single-threaded transversal (and a lot more processing), fd does not. Fd also doesn't follow symlink but helix does. Logically it takes longer to transverse the FS as a result. Helix even usese the exact same rust library as fd (ignore) we just use a different set of config flags. That is simply because our usecase is usually not opening ~ or / directly so these features were more important to us than transversal speed. I can modify the settings we pass to ignore and waiting for all files to stream in takes about the same time as fd | fzf does (it may be faster it may be slower I didn't really investigate it too much).

You can't just compare two entirely different inputs and claim one tool is faster than the other. It's an appels to orangesc comparison. Comparing tools during streaming is also not a good benchmark in general. The pielpline from the FS to the user is roughly:

creation of inputs
streaming of inputs into the matcher
matching on the inputs

Unless your streaming implementation is complete garbage it's never going to be the bottleneck anyway (your pipe reader could be but still unlikely considering how slow readdir is) so it's also not that interesting to look at. For nucleo injecting an item into the matcher involves three atomic writes and copying the actual pointer. That is just completely irrelevant compared to the time that it takes to transverse the FS.

It is also not a good benchmark because an unsorted FS transversal is not exactly deterministic (so depending on what your OS is doing in the background it can take a varying time for a file to stream in). Instead, it makes much more sense to test how long the actual matching takes.

One way to do that is of course to just micobenchmark the matcher itself. Doing system benchmarks is also interesting (but far more challenging to do right) since it takes the concurrent code into account. But that should be done once all matches are streamed in to remove as many unknown variables as possible. For example, one important variable is how many files you are matching but unless you can time your typing speeds to nanosecond accuracy if you start matching during streaming the number of matched items is likely different. For a crude benchmark the most accurate you can get is to wait for all files to stream in (making certain the number of files is the same) and then pasting a pattern (so your typing speed doesn't have an affect).

It's possible that jfind has a faster matcher. It's also possible that nucleo is faster. I wasn't aware of jfind until you opened that issue so I haven't looked at the matcher or done any comparison. If jfind has a faster matcher I would interested in looking at how I can improve. Although I don't really have time to work on something like that right now. I just found your screencast a very odd benchmark. It's not in any way an objective comparison of any code you or I implemented. It's simply comparing how certain config flags of ignore affect fs transversal speed.

jake-stewart commented 1 year ago

@pascalkuthe

previously when i have benchmarked various fuzzy finders, i ensured that the tests were fair by replacing the fd executable with a script that dumped a file of test input. this made it very deterministic.

i made it very clear that this performance test was crude due to my inability to just replace fd. this was indeed a shit performance test. with that said, jfind was still a lot faster.

you attribute this to slow singled threaded traversal and sorting results prior to matching. you are simply explaining why it is slower.

pascalkuthe commented 1 year ago

you attribute this to slow singled threaded traversal and sorting results prior to matching. you are simply explaining why it is slower.

by that logic you could also claim fzf is faster than by comparing fd| fzf and fd --follow | jfind. You are not comparing the same work so it doesn't make sense to do any comparison of how long it takes to stream a file in, not even a crude one.

jake-stewart commented 1 year ago

by that logic you could also claim fzf is faster than by comparing fd| fzf and fd --follow | jfind. You are not comparing the same work so it doesn't make sense to do any comparison of how long it takes to stream a file in, not even a crude one.

you are exactly right that fd is faster than fd --follow. if fd| fzf scores a lower time than fd --follow | jfind, then it is faster. it would be a faster approach to fuzzy finding. a faster approach still could be fd | jfind.

from what i can tell, nucleo is not a fuzzy finder. nucleo is a fuzzy matching library that can be used to create fuzzy finders or used in other tools for fuzzy matching. there may or may not be performance penalties due to it being a library. the only way to see the end result is to see the matching implemented in a fuzzy finder. as it stands, helix is that fuzzy finder.

so i test jfind against helix and from my very crude test, jfind was faster. you argue that it is unfair because helix has slower reading. there is no way for me to make the reading quicker. i cannot replace it with fd or my own test script. i am forced to use helix's slow reading. as a result, helix fuzzy finding is slower.

i dont know how else to explain it to you.

also i am not trying to undermine helix or nucleo. i hadnt heard of nucleo until this issue was opened and i was asked to compare performance design with it.

edit: @pascalkuthe

jake-stewart / jfind

performance and design comparison to nucleo #19