continuedev / continue

⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains.
https://docs.continue.dev/
Apache License 2.0

Preload model for the ollama provider #1190

Open sgwhat opened 6 months ago

sgwhat commented 6 months ago


Problem

I've noticed that when I use ollama to chat, the model loading always occurs during the first round of conversation, which makes the first round much slower than subsequent ones.

I'm exploring ways to help ollama preload the model. Even though I tried ollama run llama2:latest before conversation, the model still loads at the start of the first conversation.

Solution

No response

sestinj commented 6 months ago

@sgwhat https://github.com/ollama/ollama/blob/main/docs/api.md#load-a-model

This might be a good solution
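
For reference, the linked docs describe loading a model by sending a generate request with only a `model` field and no prompt. A minimal sketch of that call (the default localhost port and the `llama2:latest` model name are assumptions for illustration):

```python
# Preload a model on a local Ollama server, per the "load a model"
# section of the Ollama API docs: POSTing to /api/generate with only
# a "model" field (no prompt) loads the model into memory without
# generating any tokens.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint

def build_preload_request(model: str) -> bytes:
    """Build the JSON body for an empty-prompt load request."""
    return json.dumps({"model": model}).encode("utf-8")

def preload(model: str) -> None:
    """Ask the local Ollama server to load `model` into memory."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_preload_request(model),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The response should report "done": true once the model is loaded.
        print(resp.read().decode("utf-8"))

if __name__ == "__main__":
    preload("llama2:latest")  # assumed model tag; substitute your own
```

This could be fired once when the extension starts, so the load cost is paid before the first chat turn.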

sgwhat commented 6 months ago

> @sgwhat https://github.com/ollama/ollama/blob/main/docs/api.md#load-a-model
>
> This might be a good solution

I have tried that, but it still doesn't work.