gsuuon / model.nvim

Neovim plugin for interacting with LLMs and building editor-integrated prompts.

Failing to connect/start the llama.cpp server #34

Closed by mutlusun 8 months ago

mutlusun commented 8 months ago

Thank you for creating this neovim plugin and publishing it online!

I tried the plugin to access a local llama.cpp server. If I start the server manually, everything works fine. However, if I let the plugin start the llama.cpp server (as described here), I get the following error message in Neovim:

curl: (7) Failed to connect to 127.0.0.1 port 8080 after 6 ms: Couldn't connect to server

I tried setting some curl arguments to increase the timeout, but still had no luck.

My configuration looks like this:

require('llm').setup({
    llamacpp = {
        provider = llamacpp,
        options = {
            server_start = {
                command = "~/Documents/src/llama.cpp/server",
                args = {
                    "-m", "~/Documents/src/llama.cpp/models/open-llama-7b-v2/ggml-model-q4_0.gguf",
                    "-c", 2048,
                    --"-c", 4096,
                    --"-ngl", 22
                }
            },
        },
        builder = function(input, context)
            return {
                prompt = llamacpp.llama_2_user_prompt({
                    user = context.args or '',
                    message = input
                })
            }
        end,
    },
})

Am I doing something wrong here? I'm sorry if this is explained somewhere, but I couldn't find more information on this problem. Thank you for your help and time!

gsuuon commented 8 months ago

Hi! Can you double check that the command field points to the server binary? On Windows it should probably be server.exe - although you should be getting a message that starting the server failed, which may be a bug if not.
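
A quick way to check from inside Neovim whether that path actually resolves to an executable (just a sketch using plain Neovim API calls, nothing plugin-specific; the path is the one from your config above):

-- Hedged check: expand "~" and confirm the llama.cpp server binary is executable.
local server_path = vim.fn.expand("~/Documents/src/llama.cpp/server")
if vim.fn.executable(server_path) ~= 1 then
  vim.notify("llama.cpp server binary not found or not executable: " .. server_path, vim.log.levels.WARN)
end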

mutlusun commented 8 months ago

Thanks for your feedback! Yes, the path in the command field is correct. It is /Users/***/Documents/src/llama.cpp/server, as I'm currently working on macOS. I tried it now with an absolute path, but that didn't change anything.

I get the same error message as above even if I specify a wrong path to the server binary. The binary is executable. I also made sure that I'm on the latest commit.

gsuuon commented 8 months ago

Wow, I missed it again - I just noticed your setup is incorrect: the prompt needs to be in the prompts field, not at the top level, so:

require('llm').setup({
  prompts = {
    llamacpp = {
      provider = llamacpp,
      options = {
        server_start = {
          command = "~/Documents/src/llama.cpp/server",
          args = {
            "-m", "~/Documents/src/llama.cpp/models/open-llama-7b-v2/ggml-model-q4_0.gguf",
            "-c", 2048,
            --"-c", 4096,
            --"-ngl", 22
          }
        },
      },
      builder = function(input, context)
        return {
          prompt = llamacpp.llama_2_user_prompt({
            user = context.args or '',
            message = input
          })
        }
      end,
    },
  }
})

I'll be improving the docs soonish! Sorry this wasn't clear.

mutlusun commented 8 months ago

Again, thank you for your quick help! I'm sorry that I missed this point in the README; after you mentioned it, I saw it there. Just for reference, the following config works now:

local llamacpp = require("llm.providers.llamacpp")

require("llm").setup({
    prompts = {
        llamacpp = {
            provider = llamacpp,
            options = {
                server_start = {
                    command = "/Users/***/Documents/src/llama.cpp/server",
                    args = {
                        "-m", "/Users/***/Documents/src/llama.cpp/models/open-llama-7b-v2/ggml-model-q4_0.gguf",
                        "-c", 2048,
                        --"-c", 4096,
                        --"-ngl", 22
                    }
                },
            },
            builder = function(input, context)
                return {
                    prompt = llamacpp.llama_2_user_prompt({
                        user = context.args or "",
                        message = input
                    })
                }
            end,
        },
    },
})

Thus, I had to require llm.providers.llamacpp, and I actually needed absolute paths for both the server binary and the model.
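
One possible way to avoid hard-coding the /Users/... prefix (untested sketch, assuming the earlier problem with the "~" paths was just that the tilde wasn't expanded) is to expand them with Neovim before calling setup, and then use these variables in place of the literal strings in the config above:

local llamacpp = require("llm.providers.llamacpp")

-- Expand "~" to the absolute home directory so server_start.command and the
-- "-m" argument receive absolute paths (assumption: the tilde was the issue).
local server_bin = vim.fn.expand("~/Documents/src/llama.cpp/server")
local model_path = vim.fn.expand("~/Documents/src/llama.cpp/models/open-llama-7b-v2/ggml-model-q4_0.gguf")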