twinnydotdev / twinny

The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code - like GitHub Copilot but completely free and 100% private.
https://twinny.dev
MIT License
2.91k stars · 153 forks

FIM completion flexible context #257

Open kv-gits opened 4 months ago

kv-gits commented 4 months ago

I would like to set up/choose the context used for FIM completion — the current function, file, dependent files, or whole project — instead of a fixed-size window. I know it can be complicated across different languages, but it would be nice if possible.

rjmacarthy commented 3 months ago

Hello, please could you explain in some detail what is most important in this respect and any technical details which might be helpful when adding this functionality?

Many thanks.

slashedstar commented 2 months ago

I was fiddling with the context length option and it didn't seem to affect how much VRAM was being used, so I assume the context is fixed: even if you set it to include 999 lines, it will only include whatever fits in the default context limit (2048 tokens). Shouldn't we be able to adjust the actual token context window (num_ctx) instead?
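For reference, Ollama's /api/generate endpoint accepts an `options` object in the request body, and `num_ctx` there sets the context window the engine allocates. A minimal sketch of what such a request could look like — the model name, prompt template, and default value are illustrative assumptions, not twinny's actual values:

```python
import json

def build_fim_request(prefix: str, suffix: str, num_ctx: int = 4096) -> str:
    """Build a JSON body for Ollama's /api/generate with an explicit context window."""
    payload = {
        "model": "codellama:7b-code",                      # hypothetical model name
        "prompt": f"<PRE> {prefix} <SUF>{suffix} <MID>",   # CodeLlama-style FIM template
        "options": {"num_ctx": num_ctx},                   # overrides Ollama's 2048 default
        "stream": False,
    }
    return json.dumps(payload)

body = build_fim_request("def add(a, b):\n    ", "\n", num_ctx=8192)
print(body)
```

Passing `num_ctx` per request this way is what lets the client grow the context window (and the corresponding VRAM allocation) instead of being capped at the engine's default.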

AndrewRocky commented 3 weeks ago

> I was fiddling with the context length option and it didn't seem to affect how much VRAM was being used

@slashedstar, most LLM providers (their inference engines, to be exact) preallocate VRAM for the configured context length when the model is loaded.

If twinny sends a longer context than the provider's maximum context length, the beginning of the context gets truncated.
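A rough sketch of the truncation behaviour described above: when the prompt exceeds the context window, the engine keeps only the most recent tokens, so it is the beginning of the context that is lost. The whitespace "tokenizer" here is purely illustrative; real engines count tokens with the model's tokenizer.

```python
def truncate_to_context(prompt: str, num_ctx: int) -> str:
    """Keep only the last num_ctx tokens, dropping the oldest first."""
    tokens = prompt.split()  # naive tokenization, for illustration only
    if len(tokens) <= num_ctx:
        return prompt
    return " ".join(tokens[-num_ctx:])

print(truncate_to_context("a b c d e f", 3))  # keeps only "d e f"
```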

slashedstar commented 3 weeks ago

> > I was fiddling with the context length option and it didn't seem to affect how much VRAM was being used
>
> @slashedstar, Most LLM providers (their engines, to be exact) preallocate VRAM for defined context length during model loading.
>
> If Twinny sends longer context than provider's max context length - the beginning of context will get truncated.

I forgot to mention I was using Ollama. I assumed twinny would automatically expand the context as needed (by passing the required num_ctx), because Continue does this: when you change the context length setting for FIM, the VRAM usage changes accordingly. Maybe I didn't set it up properly at the time; I can't really remember.