noib3 / nvim-completion

:zap: An async autocompletion framework for Neovim

honestly im interested in how this turns out #18

Closed Β· ms-jpq closed this 1 year ago

ms-jpq commented 2 years ago

i've actually thought about writing coq in rust like half a dozen times; i probably would have if i wasn't caught up in a huge rust project for work atm.

i wonder how little / how much optimization it takes to beat coq in completion speed. recently i actually rewrote coq's tokenizer to run in randomized order, both lengthwise and front <-> backwards, so that when an input is too long and exceeds the ingestion limit, there's no bias where things in the middle of it never get indexed.
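a rough sketch of that idea in rust (a toy illustration, not coq's actual code; `ingest_unbiased` and `budget` are made-up names):

```rust
use rand::Rng;

/// Pick up to `budget` tokens starting at a random offset and scanning in a
/// random direction, wrapping around the buffer. Over repeated passes every
/// region of an over-long input gets an equal chance of being indexed.
fn ingest_unbiased(tokens: &[String], budget: usize) -> Vec<String> {
    if tokens.is_empty() {
        return Vec::new();
    }
    let mut rng = rand::thread_rng();
    let n = tokens.len();
    let start = rng.gen_range(0..n); // random starting offset
    let forward = rng.gen_bool(0.5); // random scan direction
    (0..budget.min(n))
        .map(|k| {
            let i = if forward {
                (start + k) % n
            } else {
                (start + n - k) % n
            };
            tokens[i].clone()
        })
        .collect()
}
```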


honestly i think a straight-up rust port of coq with minor adjustments would smash it by a large margin. the python RPC client is written with gevent and stitched together with asyncio, which is, like, really weird.

anyways, here's what i've learned by writing coq that can also be applicable here:


anyways looking forward to this :)

ms-jpq commented 2 years ago

also, try your hardest not to get into a serious relationship, you will have no time to program

ms-jpq commented 2 years ago

also, i guess one of my biggest fears for projects like this is how much upstream dependency you are taking on. i basically wrote every single line of code for coq, including all dependencies outside of pynvim and pyyaml, because i just don't wanna be dependent on the graces of some upstream dude who may decide to take a hiatus from open source.

i guess with mlua + nvim-oxi being like 30k+ lines of code that potentially won't be maintained, there is a risk that this project will have to take on some really heavy maintenance burdens.

noib3 commented 2 years ago

Hey @ms-jpq,

Anyways, thanks a lot for your advice and for writing coq!

ms-jpq commented 2 years ago

oh shit, that's awesome, didn't know that you also maintained nvim-oxi.

man, and i thought i put a lot of work into nvim, you probably outdid me by 4x haha

noib3 commented 1 year ago

> the 30000 word txt file is gonna really suck the first time you use it

well, not as bad as I thought!

I'm going from user input to pixels on the screen in about ~30-35ms (w/ a debug build, not release) using the lipsum source, which gives back 30k completions (and that's orders of magnitude more completions than real sources would provide).

About 70% of that is spent fuzzy matching the results, and I still have to implement caching, which would remove the other 30% on most keystrokes.
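To illustrate the kind of caching I mean, here's a minimal sketch (invented names, not the plugin's actual code): since a candidate that doesn't fuzzy-match a query can't match any extension of that query, the previous match set can be narrowed instead of recomputed from scratch.

```rust
/// Trivial subsequence check standing in for the real fuzzy matcher.
fn fuzzy_matches(candidate: &str, query: &str) -> bool {
    let mut chars = candidate.chars();
    query.chars().all(|q| chars.any(|c| c == q))
}

/// Cache of the last query and its surviving candidates: when the user
/// keeps typing, filter the old matches instead of re-running the source.
struct CompletionCache {
    query: String,
    matches: Vec<String>,
}

impl CompletionCache {
    /// Returns the narrowed match set if `query` extends the cached query,
    /// or None when the source has to be queried again from scratch.
    fn lookup(&self, query: &str) -> Option<Vec<String>> {
        query.starts_with(&self.query).then(|| {
            self.matches
                .iter()
                .filter(|m| fuzzy_matches(m, query))
                .cloned()
                .collect()
        })
    }
}
```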

https://user-images.githubusercontent.com/59321248/190917717-8af47922-2804-4fda-8990-5f874b00a434.mov

Also, the fuzzy matching crate I'm using at the moment doesn't seem that well optimized: it's ~2x slower than fzf (which is written in Go).

If I were to rewrite the fzf algo in Rust (which I'm probably gonna do eventually, but not right now) and add caching, ~I could bring that down even more to around 10-20ms~ EDIT: see below.
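For a rough sense of what such a scorer boils down to, here's a toy sketch (not fzf's actual algorithm, which is a modified Smith-Waterman; this is a single greedy pass): reward matches and consecutive runs, penalize gaps.

```rust
/// Toy fuzzy scorer: scans the candidate once, rewarding matched characters
/// (with a bonus for consecutive runs) and penalizing gaps. Returns None
/// when `query` is not a subsequence of `candidate`.
fn fuzzy_score(candidate: &str, query: &str) -> Option<i32> {
    const MATCH: i32 = 16;
    const CONSECUTIVE_BONUS: i32 = 8;
    const GAP_PENALTY: i32 = -1;

    let mut score = 0;
    let mut prev_matched = false;
    let mut query_chars = query.chars().peekable();

    for c in candidate.chars() {
        match query_chars.peek() {
            Some(&q) if q.eq_ignore_ascii_case(&c) => {
                query_chars.next();
                score += MATCH + if prev_matched { CONSECUTIVE_BONUS } else { 0 };
                prev_matched = true;
            }
            Some(_) => {
                score += GAP_PENALTY;
                prev_matched = false;
            }
            None => break,
        }
    }
    // Only a full subsequence match yields a score.
    query_chars.next().is_none().then_some(score)
}
```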

ms-jpq commented 1 year ago

> the 30000 word txt file is gonna really suck the first time you use it

haha i meant more in terms of the results from the dictionary basically overwhelming everything else, by virtue of being better matches :D

but those results are very very fast :o πŸ₯‡


Here are coq's results on a much, much smaller input set: editing its own source code.

Tested on an M1 Max.

| Source | Avg Duration | Q01 Duration | Q50 Duration | Q95 Duration | Q99 Duration |
|--------|--------------|--------------|--------------|--------------|--------------|
| 3P     | 7ms          | 4ms          | 7ms          | 12ms         | 17ms         |
| BUF    | 9ms          | 4ms          | 9ms          | 15ms         | 20ms         |
| LSP    | 11ms         | 4ms          | 7ms          | 30ms         | 57ms         |
| PATH   | 2ms          | 1ms          | 2ms          | 4ms          | 8ms          |
| SNIP   | 7ms          | 4ms          | 7ms          | 13ms         | 15ms         |
| T9     | 50ms         | 2ms          | 73ms         | 90ms         | 91ms         |
| TAG    | 6ms          | 3ms          | 6ms          | 12ms         | 16ms         |
| TMUX   | 4ms          | 2ms          | 3ms          | 6ms          | 12ms         |
| TS     | 4ms          | 2ms          | 3ms          | 7ms          | 13ms         |

This is after I rewrote the entire RPC engine from the ground up (https://github.com/ms-jpq/coq_nvim/pull/520), yielding some improvements in responsiveness here and there.

Honestly I think this is about as fast as coq can go, unless I ported it over to a faster language.

In practice, I did not even perceive any speed improvement from the RPC client rework on my M1 until I stepped down to a slower machine. You'd probably have to do this kind of optimization much sooner haha, maybe develop on a Raspberry Pi?

Anyways tho, the slowest source just dominates the q50 time no matter what we do; we just can't make LSP / Tabnine go faster, outside of caching.

noib3 commented 1 year ago

> haha i meant more in terms of the results from the dictionary basically overwhelming everything else, by virtue of being better matches :D

Got it πŸ‘. Atm I'm only testing one source, so I don't have to worry about that yet. What's coq doing to score completions coming from different sources, other than fuzzy matching?

> Honestly I think this is about as fast as coq can go, unless I ported it over to a faster language.

I'm actually pretty impressed by the LSP times. I assume you're getting those from the Neovim LSP client, and that's the time it takes to receive them, sort them w/ the other completions, and send them back to Neovim to be displayed? 11ms seems almost too fast to get the response from the language server and cross the RPC boundary twice.

> Anyways tho, the slowest source just dominates the q50 time no matter what we do; we just can't make LSP / Tabnine go faster, outside of caching

Right, but are you not sending the results independently? Or are you waiting for all sources to complete before sending the sorted completions?

ms-jpq commented 1 year ago

the measurements are taken from the moment the coq sources run, not the overall rpc -> rpc roundtrip.

the overall rpc -> rpc roundtrip is about 20ish ms on my M1. this does not include what happens after vim receives the results, so no UI time is measured.

i aggregate all results for ranking before they are sent. the reason is, well, for coding (as opposed to synthetic benchmarking) the slowest component is how fast you can read the results and decide if they are useful or not.

a naive approach of shoving the cheaper sources to the top was, i found, actually way worse UX.
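a rough sketch of that aggregate-then-rank shape (my illustration only, every name here is invented, not coq's actual code):

```rust
use std::time::Duration;

use futures::future::join_all;
use tokio::{task::JoinHandle, time::timeout};

/// Wait for every source under one shared deadline, then rank the combined
/// results in a single pass, so cheap sources can't crowd out better
/// matches from slower ones.
async fn gather_and_rank(
    sources: Vec<JoinHandle<Vec<String>>>,
    deadline: Duration,
    score: impl Fn(&str) -> i64,
) -> Vec<String> {
    let mut pool = Vec::new();
    for result in join_all(sources.into_iter().map(|s| timeout(deadline, s))).await {
        // Sources that miss the deadline (or panic) are simply dropped.
        if let Ok(Ok(completions)) = result {
            pool.extend(completions);
        }
    }
    pool.sort_by_cached_key(|c| std::cmp::Reverse(score(c)));
    pool
}
```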

ms-jpq commented 1 year ago

the overall time also includes me calling into nvim to request information on the completion context. actually, i just realized i'm doing that stupidly; could probably yield more improvements if i adjusted it.


edit:

If i turn on tabnine this thing goes to like 80ms average, and honestly maybe my eyes are not good enough: i just can't tell much difference between 80ms and 20ms on a no-brand 60 Hz monitor.

noib3 commented 1 year ago

I see. Thanks a lot for sharing coq's benchmarks; even though we're measuring times a bit differently, it'll definitely be interesting to see how nvim-completion performs.

ms-jpq commented 1 year ago

i'm sure it will be πŸš€ πŸš€ πŸš€ <3

noib3 commented 1 year ago

> If I were to rewrite the fzf algo in Rust (which I'm probably gonna do eventually, but not right now) and add caching, I could bring that down even more to around 10-20ms

Well, never mind! I've added caching and wanted to see how much of a difference using a release build of the plugin would make. The whole pipeline now takes ~2-3ms, down from ~25-30ms with a debug build :exploding_head:

[Screenshot, 2022-09-21]

ms-jpq commented 1 year ago

🀠

kuator commented 1 year ago

🀠

rip https://github.com/noib3/nvim-completion/issues/16#issuecomment-1320880996