huggingface / llm.nvim

LLM powered development for Neovim
Apache License 2.0

Can't get to work with ollama #79

Closed: Bios-Marcel closed this issue 6 months ago

Bios-Marcel commented 6 months ago

I am getting the following when hitting tab in insert mode:

Error executing vim.schedule lua callback: ...cal/share/nvim/lazy/llm.nvim/lua/llm/language_server.lua:154: attempt to index local 'completion_result' (a nil value)

This is my config:

  {
    'huggingface/llm.nvim',
    config = function()
      require('llm').setup({
        model = "starcoder2:7b",
        backend = "ollama",
        url = "http://localhost:11434/api/generate",
        -- cf https://github.com/ollama/ollama/blob/main/docs/api.md#parameters
        request_body = {
          -- Modelfile options for the model you use
          options = {
            temperature = 0.2,
            top_p = 0.95,
          }
        }
      })
    end
  },

Additionally, the LLMSuggestion command does nothing, and auto-suggestion doesn't seem to do anything either. ollama is installed, including the specified model.

Other models don't work either. Using the same model via ollama run does work. It also seems to be spawning the respective ollama serve processes.

Any idea what could be going on?

shanehull commented 6 months ago

@Bios-Marcel you need to specify the path to the llm-ls binary under lsp.

If you're using mason:

{
  -- ...
  lsp = {
    bin_path = vim.api.nvim_call_function("stdpath", { "data" }) .. "/mason/bin/llm-ls",
  },
  -- ...
}
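
(For reference, vim.fn.stdpath("data") is the shorter equivalent of that nvim_call_function call. The sketch below assumes llm-ls was installed through Mason into <data>/mason/bin, e.g. via :MasonInstall llm-ls if your Mason registry provides it; adjust the path if you installed it some other way.)

{
  -- ...
  lsp = {
    -- same path as above, just using the vim.fn.stdpath helper
    bin_path = vim.fn.stdpath("data") .. "/mason/bin/llm-ls",
  },
  -- ...
}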
Bios-Marcel commented 6 months ago

While it solves the error, I still don't get completions, and the LLMSuggestion command does nothing. Am I missing something?

shanehull commented 6 months ago

I can't seem to get starcoder2 to work either.

Try starcoder to verify your config.

return {
    {
        "huggingface/llm.nvim",
        opts = {
            enable_suggestions_on_files = {
                "*.*",
            },
            backend = "ollama",
            model = "starcoder",
            url = "http://127.0.0.1:11434/api/generate",
            lsp = {
                bin_path = vim.api.nvim_call_function("stdpath", { "data" }) .. "/mason/bin/llm-ls",
            },
            tokenizer = {
                repository = "bigcode/starcoder",
            },
        },
    },
}
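
If it still does nothing, it's worth confirming the llm-ls binary is actually at that path. A minimal check (sketch only, not part of llm.nvim) that you can drop into a scratch Lua file and run with :luafile %:

-- quick sanity check: is the Mason-installed llm-ls binary present and executable?
local bin = vim.fn.stdpath("data") .. "/mason/bin/llm-ls"
if vim.fn.executable(bin) == 1 then
  print("llm-ls found: " .. bin)
else
  print("llm-ls missing at " .. bin .. " -- install it (e.g. via Mason) or fix bin_path")
end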
life00 commented 6 months ago

The following config works well for me:

  {
    "huggingface/llm.nvim",
    opts = {
      backend = "ollama",
      model = "codellama:7b",
      accept_keymap = "<S-CR>",
      dismiss_keymap = "<CR>",
      url = "http://localhost:11434/api/generate",
      request_body = {
        options = {
          temperature = 0.2,
          top_p = 0.95,
        },
      },
      enable_suggestions_on_startup = false,
      lsp = {
        bin_path = vim.api.nvim_call_function("stdpath", { "data" }) .. "/mason/bin/llm-ls",
      },
    },
  },
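
Since enable_suggestions_on_startup is false here, nothing is requested automatically. One way to ask for a completion on demand is a small insert-mode mapping (sketch only; the <C-l> key is just an example, pick whatever is free in your setup):

-- hypothetical mapping: manually request a completion via the :LLMSuggestion command
vim.keymap.set("i", "<C-l>", function()
  vim.cmd("LLMSuggestion")
end, { desc = "llm.nvim: request suggestion" })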
Bios-Marcel commented 6 months ago

Meh, nothing works for me. Either way, I guess I'll close the issue; it probably doesn't provide value and I don't deem this worth investigating. I'll just type like a monkey :D

life00 commented 6 months ago

While it solves the error, I still don't get completions, and the LLMSuggestion command does nothing. Am I missing something?

@Bios-Marcel I think I know the problem: you should just wait longer. Try manually running the ollama server in a separate terminal with ollama serve. When you trigger LLMSuggestion, ollama should output something. Make sure you are in insert mode on the next line after a comment or some code. I suspect you just didn't wait long enough for it to generate and suggest something.
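
To rule out the connection side, you can also poke the Ollama endpoint directly from Neovim. Sketch only, assuming curl is installed and Ollama is on its default port:

-- the ollama server answers "Ollama is running" on its root endpoint when it is up
local out = vim.fn.system({ "curl", "-s", "http://localhost:11434" })
print(out) -- anything else (or an empty string) means the server isn't reachable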

Tbh I personally don't find such an LSP useful unless you have good enough hardware (which I don't), because it takes too long to generate a response, and smaller code models (which are faster) may produce worse results or hallucinate a lot.

If you are also limited by hardware like me, I recommend taking a look at https://github.com/David-Kunz/gen.nvim . It has a pretty similar development workflow, but instead of an LSP-like experience, you select the code that you want to ask about, which reduces load time. You can also watch the response load in real time. A minimal spec is sketched below.
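
The option names (model, host, port) below are taken from gen.nvim's README at the time of writing, so double-check there before copying:

{
  "David-Kunz/gen.nvim",
  opts = {
    model = "codellama:7b", -- any model you have pulled with ollama
    host = "localhost",
    port = "11434",
  },
},
-- then visually select some code and run :Gen to pick a prompt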

McPatate commented 4 months ago

@life00 appears to be correct in that running models locally for autocompletion requires a beefy setup; generating completions can be slow.