twinnydotdev / twinny

The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code - like GitHub Copilot but completely free and 100% private.
https://twinny.dev
MIT License
2.36k stars, 130 forks

Add starcoder2 (and dolphincoder) support to autocomplete (not complete yet) #174

Closed hafriedlander closed 3 months ago

hafriedlander commented 4 months ago

This needed two changes:

To do before this could be merged:

And some other ideas to improve results generally:

BTW, I think there's a bug in the file-interaction code: onDidOpenTextDocument doesn't track focus or which window is active, and multiple files can be "open" in different windows at the same time. I think it should be rewritten to use onDidChangeActiveTextEditor instead.
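To make the suggestion concrete, here is a rough sketch of focus-based tracking. `ActiveFileTracker`, `EditorLike` and `recentFiles` are hypothetical names invented for illustration and are not taken from the twinny codebase; only `vscode.window.onDidChangeActiveTextEditor` is the real VS Code API, and it is referenced only in a comment so the snippet runs outside the editor.

```typescript
// Minimal stand-in for the parts of vscode.TextEditor this sketch needs.
interface EditorLike {
  document: { uri: { toString(): string } };
}

// Hypothetical helper: remembers when each file last became the active editor.
class ActiveFileTracker {
  private lastActive = new Map<string, number>();
  private tick = 0;

  // Intended as the listener for vscode.window.onDidChangeActiveTextEditor,
  // which passes undefined when no editor has focus.
  handleActiveEditorChange(editor: EditorLike | undefined): void {
    if (!editor) return;
    this.lastActive.set(editor.document.uri.toString(), ++this.tick);
  }

  // Most recently focused files first.
  recentFiles(limit: number): string[] {
    return Array.from(this.lastActive.entries())
      .sort((a, b) => b[1] - a[1])
      .slice(0, limit)
      .map(([uri]) => uri);
  }
}

// Wiring inside an extension's activate() would look roughly like:
//   context.subscriptions.push(
//     vscode.window.onDidChangeActiveTextEditor((e) =>
//       tracker.handleActiveEditorChange(e))
//   );
```

Unlike an "open" event, this only records a file when it actually gains focus, so files sitting open in other windows do not count as recent interactions.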

Raising for now to start discussion.

hafriedlander commented 4 months ago

BTW, here's the list of extra stop-words that Continue uses per language. https://github.com/continuedev/continue/blob/main/core/autocomplete/languages.ts.
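As a sketch of how per-language stop-words could be applied on the client side: the completion is cut at the earliest stop sequence found. The `EXTRA_STOPS` table and `truncateAtStop` are hypothetical, and the stop sequences below are made up for illustration, not copied from Continue's file.

```typescript
// Hypothetical per-language stop sequences, for illustration only;
// these are NOT copied from Continue's languages.ts.
const EXTRA_STOPS: Record<string, string[]> = {
  python: ["\ndef ", "\nclass "],
  typescript: ["\nfunction ", "\nclass "],
};

// Cut a completion at the earliest occurrence of any stop sequence
// registered for the given language; unknown languages pass through unchanged.
function truncateAtStop(completion: string, languageId: string): string {
  const stops = EXTRA_STOPS[languageId] ?? [];
  let cut = completion.length;
  for (const stop of stops) {
    const idx = completion.indexOf(stop);
    if (idx !== -1 && idx < cut) cut = idx;
  }
  return completion.slice(0, cut);
}
```

Many backends also accept stop sequences in the request itself, which saves generating tokens that would be thrown away; trimming afterwards is just a fallback that works regardless of the backend.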

rjmacarthy commented 4 months ago

This is awesome thanks, is it ready for merge or are you adding more commits?

Edit: I see that there are more commits to come, no problem. One problem I have with starcoder2 is that its completions are followed by random code from other source files. Do you notice it too? Does this PR fix that?

I realise that there are some improvements to make on FIM completions, but I have not had the time to concentrate on it. Those tests were a bit brittle and only a foundation, so we can skip them for now if that is necessary to make improvements. I welcome more PRs from you if you have improvements you would like to make.

Many thanks,

hafriedlander commented 4 months ago

Yeah, I do notice starcoder2 doesn't know when to stop - or at least the 15B version I use. There's a note in the paper that they messed up the 15B FIM training, but the actual results (at least with the dolphincoder finetune) seem good. I'll test with 7B later. I have an idea for how to fix it anyway though.

(I still think deepseek-coder-33B gives better results, but you need different -base and -instruct versions and it's a much bigger model generally)
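For reference, a rough sketch of what a starcoder2 fill-in-the-middle request might look like, with `<file_sep>` included as a stop token so the model does not run on into "another file". This is not necessarily the fix alluded to above. The token strings are my understanding of the StarCoder2 tokenizer and should be checked against the model you actually run; `FimRequest` and `buildStarcoder2Request` are hypothetical names, and whether a given backend honours a `stop` list is also an assumption.

```typescript
// Assumed StarCoder2 special tokens; verify against the model's tokenizer.
const FIM = {
  prefix: "<fim_prefix>",
  suffix: "<fim_suffix>",
  middle: "<fim_middle>",
};

// Assumed tokens that mark the end of the current file or document.
const STARCODER2_STOPS = ["<file_sep>", "<|endoftext|>"];

// Hypothetical request shape: a prompt plus the stop sequences to send with it.
interface FimRequest {
  prompt: string;
  stop: string[];
}

// Build a prefix-suffix-middle prompt: the model is asked to generate
// the text that belongs between `prefix` and `suffix`.
function buildStarcoder2Request(prefix: string, suffix: string): FimRequest {
  return {
    prompt: `${FIM.prefix}${prefix}${FIM.suffix}${suffix}${FIM.middle}`,
    stop: [...STARCODER2_STOPS],
  };
}
```

Other model families (deepseek-coder, for example) use different FIM tokens, so a template like this would need to be selected per model.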

rjmacarthy commented 3 months ago

Since this PR is a month old, I will close it for now: there are many conflicts and things have changed. Please reopen if you are willing to finish it. Many thanks.