twinnydotdev / twinny

The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code - like GitHub Copilot but completely free and 100% private.
https://twinny.dev
MIT License

Instructions for the configuration on macOS with llama.cpp #211

Closed: a-rbts closed this issue 2 months ago

a-rbts commented 2 months ago

Greetings, and thanks for your hard work! I am trying to set up the extension as instructed in the README.md, but the UI does not seem to match what is described there. I am running the llama.cpp server, which offers an OpenAI-compliant API.
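For context, this is a minimal sketch of the kind of OpenAI-style request the server accepts, assuming llama.cpp's default port 8080 and the Python requests library; the model name and prompt are just placeholders:

```python
# Minimal check against llama.cpp's OpenAI-compatible chat endpoint.
# Port 8080 is llama.cpp's default; adjust if the server was started with --port.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "deepseek-coder",  # placeholder; the server uses whatever model it was started with
        "messages": [{"role": "user", "content": "Write a hello world in Python."}],
        "temperature": 0.2,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```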

1. My first question is whether I need to run two instances, one with an instruct version of the model for chat and one with the base version for FIM, or whether the extension uses the same model for both. The instructions recommend deepseek BASE for chat, but also deepseek base for completion with a good GPU, and I am not sure why the chat function would require a base model.

2. The README says: "From the top ⚙️ icon open the settings page and in the Api Provider panel change from ollama to llamacpp (or others respectively)." However, when I open the side panel and choose the configuration, there is no Api Provider setting; instead there are only fields for the Ollama Hostname and Ollama API Port, even though I am not using Ollama (screenshot below). How and where can I select llama.cpp as per the instructions?

[Screenshot: settings side panel showing only the Ollama Hostname and Ollama API Port fields]

3. Finally (and maybe a related issue), clicking on the robot emoji in the side panel shows two dropdown boxes for chat and FIM, but only one option, tagged "ollama", is available in each; no other option is displayed (screenshot below).

[Screenshot: chat and FIM model selection dropdowns]
rjmacarthy commented 2 months ago

Hello, and thanks for your interest. Please allow me to answer your questions from a personal perspective.

1. I usually run two different models, code and instruct or base and instruct, because they give better results: the models are trained for those specific tasks. You might be able to run only codellama:7b, but depending on the API it may perform badly on one task or the other (chat or FIM) for a number of reasons, for example because the prompt templates differ or are automatically formatted by some providers.

2. The Ollama settings in the settings menu only point towards Ollama so that I can fetch the models from its API; that is really their only purpose. Providers should be added under the menu here:

[Screenshot: the providers menu]

This UI allows you to set up different providers, and you can switch between them in the model selection chat interface.
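If it helps, the two servers can also be exercised outside the extension to check that each model answers on its own endpoint. A minimal sketch, assuming an instruct model served by llama.cpp on port 8080 and a base model on port 8081; the ports, the /infill field names and the use of the Python requests library are assumptions here, not what twinny itself sends:

```python
# Sanity-check two llama.cpp server instances outside the extension.
# Assumed setup: instruct model on :8080 (chat), base model on :8081 (FIM).
import requests

# Chat request against the instruct model via the OpenAI-compatible endpoint.
chat = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={"messages": [{"role": "user", "content": "Explain Python list comprehensions."}]},
    timeout=60,
).json()
print(chat["choices"][0]["message"]["content"])

# Fill-in-the-middle request against the base model via llama.cpp's /infill endpoint.
fim = requests.post(
    "http://localhost:8081/infill",
    json={
        "input_prefix": "def add(a, b):\n    return ",
        "input_suffix": "\n\nprint(add(1, 2))\n",
        "n_predict": 32,
    },
    timeout=60,
).json()
print(fim["content"])
```

If both calls return sensible text, any remaining problem is likely in the provider configuration rather than in the servers themselves.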

Hope that helps!

a-rbts commented 2 months ago

Great, thanks for the explanation. I had totally missed the providers menu, but that is exactly what I was looking for.

On the other hand, it doesn't seem to work well with llama.cpp. Chat does not work when the provider field is set to "llamacpp", but it works perfectly when selecting "oobabooga" (while still using the llama.cpp server as the backend). It seems to be related to the message format; I am not sure why the two providers are configured differently here, and it looks incorrect.

I also could not get FIM to work with either llamacpp or oobabooga. With llamacpp, the server receives requests but never answers, as the queries seem to be malformed (using deepseek-coder base). With oobabooga, the server does not seem to receive requests at all, even with the right provider and port selected.

I will be able to investigate this now that I am getting something working, so closing the issue.
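For the FIM part, I plan to compare what the extension sends with a raw request built by hand. A minimal sketch of such a request against llama.cpp's /completion endpoint, assuming port 8080 and the FIM sentinel tokens documented for deepseek-coder base models (the tokens, port and sampling parameters are assumptions):

```python
# Raw FIM request to llama.cpp's /completion endpoint for debugging.
# The sentinel tokens below are the ones documented for deepseek-coder base
# models; if they do not match the model, it will return unrelated text
# rather than a completion.
import requests

prefix = "def fib(n):\n    "
suffix = "\n\nprint(fib(10))\n"
prompt = f"<｜fim▁begin｜>{prefix}<｜fim▁hole｜>{suffix}<｜fim▁end｜>"

resp = requests.post(
    "http://localhost:8080/completion",
    json={"prompt": prompt, "n_predict": 64, "temperature": 0.1, "stream": False},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["content"])
```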

a-rbts commented 2 months ago

Adding more information here:

Hope it helps.