continuedev / continue

⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains.
https://docs.continue.dev/
Apache License 2.0
15.73k stars, 1.19k forks

Enable context providers to be automatically included in prompts #1881

Open Patrick-Erichsen opened 1 month ago

Patrick-Erichsen commented 1 month ago

Problem

https://discord.com/channels/1108621136150929458/1108621136830398496/1265344617554116751

Is there any way to add some additional custom handling into the autocomplete completion provider? Something that could provide additional chunks of context data depending on what was being edited? I want to take some internal company data that would apply to building graphql operations and be able to include that in autocomplete, or even in chat if certain words were used, but without needing to use an annotation. Looking for any ideas.

Solution

No response

mark-bradshaw commented 1 month ago

Hi @Patrick-Erichsen, thanks for opening the issue.

To clarify our need...

PROBLEM: We want to provide custom context for our company developers' questions, but they frequently forget to manually invoke the context provider, don't know when to invoke it, or just feel like it's a hassle to keep invoking it. This leads to lower engagement with our custom context provider and lower-quality answers from the LLM.

PROPOSAL: I'd like an option that allows a custom context provider to opt in to looking at every user message, determine whether it's something the provider cares about, and then pull context items related specifically to that message. The message could be matched against regexes for speed, or even quickly bounced off a small model for evaluation if timing permits. Being able to opt in to see every message would make the context provider globally available without needing special invocation. This would remove the hurdle to getting context in front of every developer and raise the overall answer quality.

Ideally the configuration for enabling this would be done at the custom provider level, so that the user has minimal work to do to enable the provider without making any mistakes.
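
To make the proposal concrete, here is a minimal sketch of what an opt-in provider might look like. Everything in it is an assumption for illustration: the `AutoContextProvider` interface, the `seesEveryMessage` flag, the `triggers` list, and the `collectAutoContext` helper are hypothetical names, not Continue's actual API.

```typescript
// Hypothetical types for illustration only; these are NOT Continue's real interfaces.
interface ContextItem {
  name: string;
  description: string;
  content: string;
}

interface AutoContextProvider {
  title: string;
  // Hypothetical opt-in flag: the provider asks to see every user message.
  seesEveryMessage: boolean;
  // Cheap regex pre-filter, checked before any context is fetched.
  // The regexes must not use the `g` flag, because `test` would then be stateful.
  triggers: RegExp[];
  getContextItems(message: string): Promise<ContextItem[]>;
}

// Decide whether a provider should run for a message without being invoked manually.
function shouldInvoke(provider: AutoContextProvider, message: string): boolean {
  return provider.seesEveryMessage && provider.triggers.some((re) => re.test(message));
}

// Gather context from every opted-in provider whose triggers match the message.
async function collectAutoContext(
  message: string,
  providers: AutoContextProvider[],
): Promise<ContextItem[]> {
  const items: ContextItem[] = [];
  for (const provider of providers) {
    if (shouldInvoke(provider, message)) {
      items.push(...(await provider.getContextItems(message)));
    }
  }
  return items;
}

// Example provider; the lookup is a placeholder for internal company data.
const graphqlProvider: AutoContextProvider = {
  title: "company-graphql",
  seesEveryMessage: true,
  triggers: [/\bgraphql\b/i, /\bgql\b/i],
  async getContextItems(message) {
    return [
      {
        name: "GraphQL conventions",
        description: "Internal GraphQL guidelines (placeholder)",
        content: `Guidelines relevant to: ${message}`,
      },
    ];
  },
};
```

Because the flag and the triggers live on the provider itself, the end user would have nothing to configure beyond installing it. A small-model check could replace the regex pre-filter inside `shouldInvoke` if latency allows.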

DANGERS: Some possible dangers we'd need to avoid:

  1. Overeager providers might inflate context sizes
  2. Without access to the chat history, a provider might not get invoked when it should
  3. Providers that see every message without being invoked could potentially exfiltrate data

To avoid these dangers we might:

  1. Cap the allowed context size that can be returned (might already be done)
  2. Pass a chat history object to context providers so that they can see whether they were recently invoked and might still want to provide additional data (might already be done)
  3. Add a warning when enabling a custom context provider that will see every message.
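
The first two mitigations could be sketched roughly as follows. This assumes a simple character budget and a plain array of prior messages; `capContext`, `matchesRecentHistory`, and the `lookback` parameter are hypothetical names for illustration, not existing Continue functions.

```typescript
// Hypothetical shape for illustration; not Continue's real type.
interface ContextItem {
  name: string;
  content: string;
}

// Mitigation 1: keep items in order until the character budget is used up,
// truncating the item that crosses the limit and dropping everything after it.
function capContext(items: ContextItem[], maxChars: number): ContextItem[] {
  const kept: ContextItem[] = [];
  let remaining = maxChars;
  for (const item of items) {
    if (remaining <= 0) break;
    const content = item.content.slice(0, remaining);
    kept.push({ ...item, content });
    remaining -= content.length;
  }
  return kept;
}

// Mitigation 2: trigger on the current message or on any of the last
// `lookback` messages, so follow-up questions still receive context.
// The trigger must not use the `g` flag, because `test` would then be stateful.
function matchesRecentHistory(
  history: string[],
  current: string,
  trigger: RegExp,
  lookback: number = 3,
): boolean {
  // slice(-0) would return the whole array, so handle zero explicitly.
  const recent = lookback > 0 ? history.slice(-lookback) : [];
  return [...recent, current].some((message) => trigger.test(message));
}
```

A token-based budget would be more accurate than a character count, but it depends on the model's tokenizer, so characters are used here to keep the sketch self-contained.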

mark-bradshaw commented 1 month ago

Here's a similar issue: https://github.com/continuedev/continue/issues/1730

sestinj commented 1 month ago

@mark-bradshaw thanks for the full expansion here. All of this sounds reasonable to me.

When you say

Ideally the configuration for enabling this would be done at the custom provider level, so that the user has minimal work to do to enable the provider without making any mistakes.

this leads me to think of a few methods of configuration:

1) A top-level property that takes an array of context provider names (this already exists; it just doesn't support arbitrary context providers quite yet):

  "experimental": {
    "defaultContext": ["activeFile"]
  }

2) A property on each context provider that determines whether it is used by default:

  "contextProviders": [
    {
      "name": "url"
    },
    {
      "name": "folder"
    },
    {
      "name": "docs"
    },
    {
      "name": "os",
      "includeByDefault": true
    }
  ],

3) Outsource this to a special .prompt file that can override the template for every single message and, as part of doing so, call a context provider. With a special filename such as default.prompt, you could then just give it the contents {{{ myContextProviderName }}}

My gut is that (3) isn't easy enough for users to set up, and that (2) could be confusing because for something like "docs" you actually need to specify which docs you want to use. So my preference is (1), but I'd welcome your thoughts.
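
For comparison, options (1) and (2) could coexist if the default set were resolved from both places. The sketch below assumes config shaped like the two snippets above; `resolveDefaultProviders` and the `Config` types are hypothetical and only illustrate the merge, not how Continue actually reads its config.

```typescript
// Hypothetical config shapes mirroring the snippets in this thread.
interface ProviderConfig {
  name: string;
  includeByDefault?: boolean; // option (2): per-provider flag
}

interface Config {
  experimental?: { defaultContext?: string[] }; // option (1): top-level list
  contextProviders?: ProviderConfig[];
}

// Union of both sources, de-duplicated, with top-level names listed first.
function resolveDefaultProviders(config: Config): string[] {
  const fromTopLevel = config.experimental?.defaultContext ?? [];
  const fromFlags = (config.contextProviders ?? [])
    .filter((provider) => provider.includeByDefault === true)
    .map((provider) => provider.name);
  // A Set keeps insertion order, so the result is stable.
  return Array.from(new Set([...fromTopLevel, ...fromFlags]));
}

const exampleConfig: Config = {
  experimental: { defaultContext: ["activeFile"] },
  contextProviders: [
    { name: "url" },
    { name: "folder" },
    { name: "docs" },
    { name: "os", includeByDefault: true },
  ],
};

console.log(resolveDefaultProviders(exampleConfig));
```

Neither source addresses the "docs" problem noted above: a provider that needs a query or parameters would still need some way to receive them when included by default.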

mark-bradshaw commented 1 month ago

Thanks for the feedback, @sestinj. My assumption on configuration would've been option #2, but option #1 seems like it would be fine too. Extra points that it's already mostly implemented.

I'd be happy to contribute, but I'm unsure of my availability at this point. It might be a while.