continuedev / continue

⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains
https://docs.continue.dev/
Apache License 2.0

Llama.cpp-hosted Codestral-22B not getting correct templates. #1418

Open CambridgeComputing opened 1 month ago

CambridgeComputing commented 1 month ago


Relevant environment info

- OS: Windows 11 Pro
- Continue: v0.9.150 (pre-release)
- IDE: VSCode v1.89.1

Description

I have the following model definition for Codestral-22B running locally on llama.cpp server:

    {
      "title": "Codestral",
      "model": "Codestral-22B",
      "contextLength": 16384,
      "completionOptions": {},
      "apiBase": "http://localhost:8080",
      "provider": "llama.cpp",
      "template": "llama2"
    }

With no template specified, code editing works but chat does not. For regular chat to work, I have to specify the llama2 template; without it, I get the following error popup when using chat:

Error: You must either implement templateMessages or _streamChat

When I do specify the template as llama2, code editing (Ctrl+I, i.e. highlighting code and requesting changes) no longer works and gives this different error:

Error streaming diff: TypeError: templateMessages is not a function

Looking through the code in autodetect.ts, it appears that the edit template is supposed to be osModelsEditPrompt, but an earlier else if branch matches first and assigns llama2 instead.

I can't make sense of the errors I got, but either moving

      } else if (model.includes("codestral")) {
        editTemplate = osModelsEditPrompt;
      }

...up before } else if (templateType === "llama2") { (line 282), or otherwise changing the autodetect logic, would "make" it work, but I'm not sure that addresses the root issue.
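
For illustration, here is a minimal sketch of the reordering I have in mind. This is not the actual autodetect.ts code; pickEditTemplate, llama2EditPrompt, and the prompt strings are made-up placeholders. The point is just that checking for codestral before the llama2 template type lets the edit template resolve to osModelsEditPrompt:

    // Hypothetical sketch only: names other than osModelsEditPrompt are
    // placeholders, not Continue's real identifiers.
    const osModelsEditPrompt = "<os-models edit prompt>";
    const llama2EditPrompt = "<llama2 edit prompt>";
    const defaultEditPrompt = "<default edit prompt>";

    function pickEditTemplate(model: string, templateType?: string): string {
      let editTemplate = defaultEditPrompt;
      // Checking for codestral first keeps the llama2 chat template from
      // shadowing the intended edit template.
      if (model.toLowerCase().includes("codestral")) {
        editTemplate = osModelsEditPrompt;
      } else if (templateType === "llama2") {
        editTemplate = llama2EditPrompt;
      }
      return editTemplate;
    }

    // pickEditTemplate("Codestral-22B", "llama2") -> osModelsEditPrompt,
    // whereas with the branches in the current order it would return the
    // llama2 template.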

Edit - partial solution

I have found the following partial workaround that has gotten me up and running, but it's a bit of a pain since I'm swapping models often for testing and benchmarking. To work around the issue, I added a second model definition with no template to be used for editing, and added modelRoles with inlineEdit pointing to it. My new config:

    {
      "models": [
        {
          "title": "Codestral",
          "model": "Codestral-22B",
          "contextLength": 16384,
          "completionOptions": {},
          "apiBase": "http://localhost:8080",
          "provider": "llama.cpp",
          "template": "llama2"
        },
        {
          "title": "Codestral - Edit",
          "model": "Codestral-22B",
          "contextLength": 16384,
          "completionOptions": {},
          "apiBase": "http://localhost:8080",
          "provider": "llama.cpp"
        }
      ],
      "disableSessionTitles": true,
      "experimental": {
        "modelRoles": {
          "inlineEdit": "Codestral - Edit"
        }
      }
    }

If there is a better way to do this, please let me know. Thanks!

sestinj commented 1 month ago

@CambridgeComputing thanks for sharing this. Looks like our template definitions got a bit twisted because of the different contexts in which codestral is used. A quick fix would be possible, but to avoid playing a game of whack-a-mole, and since you have a temporary workaround, I'm going to take an extra second and try to do this correctly (it needs a little refactoring/cleanup).