twinnydotdev / twinny

The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code - like GitHub Copilot but completely free and 100% private.
https://twinny.dev
MIT License

Separate options for amount of lines 'before' and 'after' the current line in FIM prompts #298


AndrewRocky commented 2 months ago

**Is your feature request related to a problem? Please describe.**
When editing the beginning of a long file, prompt evaluation takes a long time. The reason for that is explained under Additional context below.

Currently we send a similar number of lines from the top and the bottom. I believe there are reasons to make the bottom part smaller:

  1. It takes a long time to re-evaluate the bottom lines.
  2. The bottom lines often aren't as important (IMO). Shrinking them leaves more of the context window for the top lines.

**Describe the solution you'd like**
I want separate Context Length options for 'before' and 'after'.

**Describe alternatives you've considered**
Alternatively, leave the current Twinny: Context Length setting as it is, but add an optional override for the bottom lines.
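A rough sketch of what the proposed split could look like in the extension's prompt-building code. The option names and the `buildFimPrompt` helper are hypothetical, not existing twinny settings; the template tokens are the DeepSeek-style ones from the example prompts below:

```typescript
// Hypothetical option names -- these are not existing twinny settings.
interface FimContextOptions {
  linesBefore: number // e.g. 100 lines above the cursor
  linesAfter: number  // e.g. 20 lines below, keeping the prompt suffix short
}

// Simplified: splits at whole-line boundaries rather than mid-line at the cursor.
function buildFimPrompt(
  lines: string[],
  cursorLine: number,
  opts: FimContextOptions
): string {
  const prefix = lines
    .slice(Math.max(0, cursorLine - opts.linesBefore), cursorLine)
    .join('\n')
  const suffix = lines
    .slice(cursorLine, cursorLine + opts.linesAfter)
    .join('\n')
  // DeepSeek-style FIM tokens, matching the example prompts below.
  return `<|fim▁begin|>${prefix}\n<|fim▁hole|>\n${suffix}<|fim▁end|>`
}
```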

**Additional context**
For context: AFAIK (this is mostly based on my assumptions), llama.cpp doesn't have to re-evaluate the prefix part of a prompt that hasn't changed since the last generation. But the moment it encounters a change, it starts re-evaluating everything after that change. So when we have two requests in a row with these prompts:

```
<|fim▁begin|>
import numpy
<|fim▁hole|>
print('Hello World!')<|fim▁end|>
```

```
<|fim▁begin|>
import numpy
import<|fim▁hole|>
print('Hello World!')<|fim▁end|>
```

It won't have to spend time evaluating `import numpy`. However, it will still have to re-run everything after `<|fim▁hole|>`, because it only checks for a matching prefix of the prompt. (Example of llama.cpp output, not for this exact case: `Llama.generate: 2978 prefix-match hit, remaining 8 prompt tokens to eval`)
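To make that caching behaviour concrete, here is a small sketch (my own illustration, not llama.cpp or twinny code) of prefix-match reuse at the token level:

```typescript
// Sketch of prefix-match caching, as the llama.cpp log line suggests:
// tokens matching the previously evaluated prompt are reused from the
// KV cache; everything after the first differing token is re-evaluated.
function tokensToEvaluate(cached: number[], prompt: number[]): number {
  let hit = 0
  while (hit < cached.length && hit < prompt.length && cached[hit] === prompt[hit]) {
    hit++
  }
  // A change near the top of the prompt makes `hit` small, so almost the
  // whole prompt (including the long part after <|fim▁hole|>) is re-run.
  return prompt.length - hit
}
```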

rjmacarthy commented 2 months ago

Hey, currently we use 0.85 for the prefix and 0.15 for the context; I guess we could make it configurable though.
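For reference, the fixed split described here would look something like the following sketch (the function and variable names are made up; only the 0.85/0.15 ratio and the Twinny: Context Length setting come from this thread):

```typescript
// Sketch of the fixed 0.85 / 0.15 split quoted above. `contextLength`
// stands in for the existing "Twinny: Context Length" setting.
function splitContext(contextLength: number): { before: number; after: number } {
  const before = Math.floor(contextLength * 0.85) // lines/tokens before the cursor
  const after = contextLength - before            // ~0.15 after the cursor
  return { before, after }
}
```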