
Neovim plugin for interacting with LLMs and building editor-integrated prompts.
MIT License

šŸ—æ model.nvim

Use AI models in Neovim for completions or chat. Build prompts programmatically with Lua. Designed for those who want to customize their prompts, experiment with multiple providers, or use local models.

https://github.com/gsuuon/model.nvim/assets/6422188/3af3e65d-d13c-4196-abe1-07d605225c10

Features

Contents

If you have any questions feel free to ask in discussions


Setup

Requirements

With lazy.nvim

require('lazy').setup({
  {
    'gsuuon/model.nvim',

    -- Don't need these if lazy = false
    cmd = { 'M', 'Model', 'Mchat' },
    init = function()
      vim.filetype.add({
        extension = {
          mchat = 'mchat',
        }
      })
    end,
    ft = 'mchat',

    keys = {
      {'<C-m>d', ':Mdelete<cr>', mode = 'n'},
      {'<C-m>s', ':Mselect<cr>', mode = 'n'},
      {'<C-m><space>', ':Mchat<cr>', mode = 'n' }
    },

    -- To override defaults add a config field and call setup()

    -- config = function()
    --   require('model').setup({
    --     prompts = {..},
    --     chats = {..},
    --     ..
    --   })
    --
    --   require('model.providers.llamacpp').setup({
    --     binary = '~/path/to/server/binary',
    --     models = '~/path/to/models/directory'
    --   })
    --end
  }
})

Treesitter

To get treesitter highlighting of chat buffers with markdown injections, use :TSInstall mchat after model.nvim has been loaded (if you're using Lazy run :Lazy load model.nvim first). The grammar repo is at gsuuon/tree-sitter-mchat.


Secrets

If you prefer to keep keys out of your environment, they can also be set programmatically using :help vim.env or using the secrets field of the model.nvim setup table. config.secrets takes a table of functions which return a string - the function is called when the key is used and the result is cached for subsequent calls:

require('model').setup({
  secrets = {
    PROVIDER_API_KEY = function()
      return 'some key'
    end
  }
})

Usage

https://github.com/gsuuon/model.nvim/assets/6422188/ae00076d-3327-4d97-9cc1-41acffead327

model.nvim comes with some starter prompts and makes it easy to build your own prompt library. For an example of a more complex, agent-like multi-step prompt, where we curl for an OpenAPI schema, ask GPT for the relevant endpoint, and then include that in a final prompt, look at the openapi starter prompt.

Prompts can have 5 different modes which determine what happens to the response: append, insert, replace, buffer, insert_or_replace. The default is to append, and with no visual selection the default input is the entire buffer, so your response will be at the end of the file. Modes are configured on a per-prompt basis.
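As a sketch of per-prompt mode configuration, here is a prompt that replaces the selection (or buffer) instead of appending. The prompt name `rewrite` is made up for illustration; the `mode` enum export and the openai provider are from this plugin:

```lua
local openai = require('model.providers.openai')
local mode = require('model').mode

require('model').setup({
  prompts = {
    -- hypothetical prompt; 'rewrite' is just an example name
    rewrite = {
      provider = openai,
      mode = mode.REPLACE, -- response replaces the selection (or buffer) instead of appending
      builder = function(input)
        return {
          messages = {
            { role = 'user', content = input }
          }
        }
      end
    }
  }
})
```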

Commands

Run prompts

Run a completion prompt

Start a new chat

Run a chat buffer

Telescope extension

If you use telescope, mchat buffers can be browsed with :Telescope model mchat.

Manage responses

Responses are inserted with extmarks, so once the buffer is closed the responses become normal text and won't work with the following commands.

Select response https://github.com/gsuuon/llm.nvim/assets/6422188/fd5aca13-979f-4bcf-8570-f935fdebbf03
Delete response https://user-images.githubusercontent.com/6422188/233774216-4e100122-3a93-4dfb-a7c7-df50f1221bdd.mp4
Cancel response https://user-images.githubusercontent.com/6422188/233773436-3e9d2a15-bc87-47c2-bc5b-d62d62480297.mp4
Show response https://user-images.githubusercontent.com/6422188/233773449-3b85355b-bad1-4e40-a699-6a8f5cf4bcd5.mp4

Manage context

There are some basic context management helpers which use the quickfix list:

File content of the quickfix list (what :MCpaste inserts) can be accessed programmatically via require('model.util.qflist').get_text(), for example:

local qflist = require('model.util.qflist')
local starters = require('model.prompts.chats')

config.chats = {
  ['codellama:qfix'] = vim.tbl_deep_extend('force', starters['together:codellama'], {
    system = 'You are an intelligent programming assistant',
    create = function()
      return qflist.get_text()
    end
  }),
}

šŸš§ WIP - Local vector store

Setup and usage

Requirements

- Python 3.10+
- `pip install numpy openai tiktoken`

Usage

Check the module functions exposed in [store](./lua/model/store/init.lua). This uses the OpenAI embeddings api to generate vectors and queries them by cosine similarity.

To add items call into the `model.store` lua module functions, e.g.:

- `:lua require('model.store').add_lua_functions()`
- `:lua require('model.store').add_files('.')`

Look at `store.add_lua_functions` for an example of how to use treesitter to parse files to nodes and add them to the local store.

To get query results call `store.prompt.query_store` with your input text, desired count and similarity cutoff threshold (0.75 seems to be decent). It returns a list of `{id: string, content: string}`:

```lua
builder = function(input, context)
  ---@type {id: string, content: string}[]
  local store_results = require('model.store').prompt.query_store(input, 2, 0.75)

  -- add store_results to your messages
end
```

Configuration

All setup options are optional. Add new prompts to options.prompts.[name] and chat prompts to options.chats.[name].

require('model').setup({
  default_prompt = {},
  prompts = {...},
  chats = {...},
  hl_group = 'Comment',
  join_undo = true,
})

Prompts

Prompts go in the prompts field of the setup table and are run by the command :Model [prompt name] or :M [prompt name]. The commands tab-complete with the available prompts.

With lazy.nvim:

{
  'gsuuon/model.nvim',
  config = function()
    require('model').setup({
      prompts = {
        instruct = { ... },
        code = { ... },
        ask = { ... }
      }
    })
  end
}

A prompt entry defines how to handle a completion request - it takes in the editor input (either an entire file or a visual selection) and some context, and produces the api request data, merging with any defaults. It also defines how to handle the API response - for example it can replace the selection (or file) with the response or insert it at the cursor position.

Check out the starter prompts to see how to create prompts. Check out the reference for the type definitions.

Chat prompts

https://github.com/gsuuon/llm.nvim/assets/6422188/b5082daa-173a-4739-9690-a40ce2c39d15

Chat prompts go in the chats field of the setup table.

{
  'gsuuon/model.nvim',
  config = function()
    require('model').setup({
      prompts = { ... },
      chats = {
        gpt4 = { ... },
        mixtral = { ... },
        starling = { ... }
      }
    })
  end
}

Use :Mchat [name] to create a new mchat buffer with that chat prompt. The command will tab complete with available chat prompts. You can prefix the command with :horizontal Mchat [name] or :tab Mchat [name] to create the buffer in a horizontal split or new tab.
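Since :Mchat accepts command modifiers, you can also bind a key for a frequently used chat prompt. A sketch (both the mapping and the `gpt4` name here are examples; the chat prompt must exist in your `chats` table):

```lua
-- hypothetical mapping; 'gpt4' must be defined in your chats table
vim.keymap.set('n', '<leader>mc', ':tab Mchat gpt4<cr>', { desc = 'Open a gpt4 chat in a new tab' })
```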

A brand new mchat buffer might look like this:

openai
---
{
  params = {
    model = "gpt-4-1106-preview"
  }
}
---
> You are a helpful assistant

Count to three

Run :Mchat in the new buffer (with no name argument) to get the assistant response. You can edit any of the messages, params, options or system instruction (the first line, if it starts with >) as necessary throughout the conversation. You can also copy/paste to a new buffer, :set ft=mchat and run :Mchat.

You can save the buffer with an .mchat extension to continue the chat later using the same settings shown in the header. mchat comes with some syntax highlighting and folds to show the various chat parts - the name of the chat prompt runner, the options and params in the header, and the system message.

Check out the starter chat prompts to see how to add your own. Check out the reference for the type definitions.

Library autoload

You can use require('model.util').module.autoload instead of a naked require to always re-require a module on use. This makes the feedback loop for developing prompts faster:

require('model').setup({
-  prompts = require('prompt_library')
+  prompts = require('model.util').module.autoload('prompt_library')
})

I recommend setting this only during active prompt development, and switching to a normal require otherwise.

Providers

The available providers are in ./lua/model/providers.

OpenAI ChatGPT

(default)

Set the OPENAI_API_KEY environment variable to your api key.
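Alternatively, the key can be set from Lua rather than your shell (see the Secrets section above), for example:

```lua
-- sets the variable for the Neovim process only; replace with your real key
vim.env.OPENAI_API_KEY = 'your-api-key'
```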

openai parameters

Parameters are documented here. You can override the default parameters for this provider by calling initialize:

    config = function()
      require('model.providers.openai').initialize({
        model = 'gpt-4-1106-preview'
      })
    end

openai prompt options

OpenAI prompts can take an additional options field to talk to compatible APIs.

  compat = vim.tbl_extend('force', openai.default_prompt, {
    options = {
      url = 'http://127.0.0.1:8000/v1/'
    }
  })

For example, to configure it for Mistral AI "La plateforme":

  {
      "gsuuon/model.nvim",
      cmd = { "Model", "Mchat" },
      init = function()
          vim.filetype.add({ extension = { mchat = "mchat" } })
      end,
      ft = "mchat",
      keys = { { "<leader>h", ":Model<cr>", mode = "v" } },
      config = function()
          local mistral = require("model.providers.openai")
          local util = require("model.util")
          require("model").setup({
              hl_group = "Substitute",
              prompts = util.module.autoload("prompt_library"),
              default_prompt = {
                  provider = mistral,
                  options = {
                      url = "https://api.mistral.ai/v1/",
                      authorization = "Bearer YOUR_MISTRAL_API_KEY",
                  },
                  builder = function(input)
                      return {
                          model = "mistral-medium",
                          temperature = 0.3,
                          max_tokens = 400,
                          messages = {
                              {
                                  role = "system",
                                  content = "You are helpful assistant.",
                              },
                              { role = "user", content = input },
                          },
                      }
                  end,
              },
          })
      end,
  },

LlamaCpp

This provider uses the llama.cpp server.

You can start the server manually or have it autostart when you run a llamacpp prompt. To autostart the server call require('model.providers.llamacpp').setup({}) in your config function and set a model in the prompt options (see below). Leave model empty to not autostart. The server restarts if the prompt model or args change.

Setup

  1. Build llama.cpp
  2. Download the model you want to use, e.g. Zephyr 7b beta
  3. Setup the llamacpp provider if you plan to use autostart:

    config = function()
      require('model').setup({ .. })
    
      require('model.providers.llamacpp').setup({
        binary = '~/path/to/server/binary',
        models = '~/path/to/models/directory'
      })
    end
  4. Use the llamacpp provider in a prompt:

    local llamacpp = require('model.providers.llamacpp')
    
    require('model').setup({
      prompts = {
        zephyr = {
          provider = llamacpp,
          options = {
            model = 'zephyr-7b-beta.Q5_K_M.gguf',
            args = {
              '-c', 8192,
              '-ngl', 35
            }
          },
          builder = function(input, context)
            return {
              prompt =
                '<|system|>'
                .. (context.args or 'You are a helpful assistant')
                .. '\n</s>\n<|user|>\n'
                .. input
                .. '</s>\n<|assistant|>',
              stops = { '</s>' }
            }
          end
        }
      }
    })

LlamaCpp setup options

Setup require('model.providers.llamacpp').setup({})

LlamaCpp prompt options

Ollama

This uses the ollama REST server's /api/generate endpoint. raw defaults to true, and stream is always true.

Example prompt with starling:

  ['ollama:starling'] = {
    provider = ollama,
    params = {
      model = 'starling-lm'
    },
    builder = function(input)
      return {
        prompt = 'GPT4 Correct User: ' .. input .. '<|end_of_turn|>GPT4 Correct Assistant: '
      }
    end
  },

Google PaLM

Set the PALM_API_KEY environment variable to your api key.

The PaLM provider defaults to the text model (text-bison-001). The builder's return params can include model = 'chat-bison-001' to use the chat model instead.

Params should be either a generateText body by default, or a generateMessage body if using model = 'chat-bison-001'.

palm = {
  provider = palm,
  builder = function(input, context)
    return {
      model = 'text-bison-001',
      prompt = {
        text = input
      },
      temperature = 0.2
    }
  end
}
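For the chat model, the builder can instead return a generateMessage-style body. A sketch (field shapes follow the PaLM generateMessage API; verify against the current docs):

```lua
palm_chat = {
  provider = palm,
  builder = function(input, context)
    return {
      model = 'chat-bison-001',
      prompt = {
        messages = {
          { content = input }
        }
      },
      temperature = 0.2
    }
  end
}
```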

Together

Set the TOGETHER_API_KEY environment variable to your api key. Talks to the together inference endpoint.

  ['together:phind/codellama34b_v2'] = {
    provider = together,
    params = {
      model = 'Phind/Phind-CodeLlama-34B-v2',
      max_tokens = 1024
    },
    builder = function(input)
      return {
        prompt = '### System Prompt\nYou are an intelligent programming assistant\n\n### User Message\n' .. input  ..'\n\n### Assistant\n'
      }
    end
  },

Huggingface API

Set the HUGGINGFACE_API_KEY environment variable to your api key.

Set the model field on the params returned by the builder (or the static params in prompt.params). Set params.stream = false for models which don't support it (e.g. gpt2). Check huggingface api docs for per-task request body types.

  ['hf:starcoder'] = {
    provider = huggingface,
    options = {
      model = 'bigcode/starcoder'
    },
    builder = function(input)
      return { inputs = input }
    end
  },
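For a model without streaming support, such as gpt2, the same shape works with `params.stream = false` (a sketch following the starcoder example above):

```lua
  ['hf:gpt2'] = {
    provider = huggingface,
    options = {
      model = 'gpt2'
    },
    params = {
      stream = false
    },
    builder = function(input)
      return { inputs = input }
    end
  },
```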

Kobold

For older models that don't work with llama.cpp, koboldcpp might still support them. Check their repo for setup info.

Langserve

Set the output_parser to correctly parse the contents returned from the /stream endpoint and use the builder to construct the input query. The below uses the example langserve application to make a joke about the input text.

  ['langserve:make-a-joke'] = {
    provider = langserve,
    options = {
      base_url = 'https://langserve-launch-example-vz4y4ooboq-uc.a.run.app/',
      output_parser = langserve.generation_chunk_parser,
    },
    builder = function(input, context)
      return {
        topic = input,
      }
    end
  },

Adding your own

Providers implement a simple interface so it's easy to add your own. Just set your provider as the provider field in a prompt. Your provider needs to kick off the request and call the handlers as data streams in, finishes, or errors. Check the hf provider for a simpler example supporting server-sent events streaming. If you don't need streaming, just make a request and call handler.on_finish with the result.

Basic provider example:

local test_provider = {
  request_completion = function(handlers, params, options)
    vim.notify(vim.inspect({params=params, options=options}))
    handlers.on_partial('a response')
    handlers.on_finish()
  end
}

require('model').setup({
  prompts = {
    test_prompt = {
      provider = test_provider,
      builder = function(input, context)
        return {
          input = input,
          context = context
        }
      end
    }
  }
})

Reference

The following are types and the fields they contain:

SetupOptions

Setup require('model').setup(SetupOptions)

Prompt

params are generally data that go directly into the request sent by the provider (e.g. content, temperature). options are used by the provider to know how to handle the request (e.g. server url or model name if a local LLM).

Setup require('model').setup({prompts = { [prompt name] = Prompt, .. }})
Run :Model [prompt name] or :M [prompt name]
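As an illustrative sketch of the split (the prompt name and server URL here are made up): per-request data returned by the builder merges with `params`, while `options` tell the provider where and how to send it:

```lua
['local:example'] = {
  provider = require('model.providers.openai'),
  options = {
    url = 'http://127.0.0.1:8000/v1/' -- provider-level: where to send the request
  },
  params = {
    temperature = 0.2 -- merged into every request body
  },
  builder = function(input)
    return { messages = { { role = 'user', content = input } } }
  end
}
```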

Provider

ParamsBuilder

(function)

SegmentMode

(enum)

Exported as local mode = require('model').mode

StreamHandlers

ChatPrompt

params are generally data that go directly into the request sent by the provider (e.g. content, temperature). options are used by the provider to know how to handle the request (e.g. server url or model name if a local LLM).

Setup require('model').setup({chats = { [chat name] = ChatPrompt, .. }})
Run :Mchat [chat name]

ChatMessage

ChatConfig

ChatContents

Context

Selection

Position

Examples

Prompts

require('model').setup({
  prompts = {
    ['prompt name'] = ...
  }
})
Ask for additional user instruction

https://github.com/gsuuon/llm.nvim/assets/6422188/0e4b2b68-5873-42af-905c-3bd5a0bdfe46

```lua
ask = {
  provider = openai,
  params = {
    temperature = 0.3,
    max_tokens = 1500
  },
  builder = function(input)
    local messages = {
      { role = 'user', content = input }
    }

    return util.builder.user_prompt(function(user_input)
      if #user_input > 0 then
        table.insert(messages, {
          role = 'user',
          content = user_input
        })
      end

      return {
        messages = messages
      }
    end, input)
  end,
}
```
Create a commit message based on `git diff --staged`

https://user-images.githubusercontent.com/6422188/233807212-d1830514-fe3b-4d38-877e-f3ecbdb222aa.mp4

```lua
['commit message'] = {
  provider = openai,
  mode = mode.INSERT,
  builder = function()
    local git_diff = vim.fn.system {'git', 'diff', '--staged'}
    return {
      messages = {
        {
          role = 'system',
          content = 'Write a short commit message according to the Conventional Commits specification for the following git diff: ```\n' .. git_diff .. '\n```'
        }
      }
    }
  end,
}
```
Modify input to append messages

https://user-images.githubusercontent.com/6422188/233748890-5dac719a-eb9a-4f76-ab9d-8eba3694a350.mp4
Replace text with Spanish

```lua
local openai = require('model.providers.openai')
local segment = require('model.util.segment')

require('model').setup({
  prompts = {
    ['to spanish'] = {
      provider = openai,
      hl_group = 'SpecialComment',
      builder = function(input)
        return {
          messages = {
            {
              role = 'system',
              content = 'Translate to Spanish',
            },
            {
              role = 'user',
              content = input,
            }
          }
        }
      end,
      mode = segment.mode.REPLACE
    }
  }
})
```
Notifies each stream part and the complete response

```lua
local openai = require('model.providers.openai')

require('model').setup({
  prompts = {
    ['show parts'] = {
      provider = openai,
      builder = openai.default_builder,
      mode = {
        on_finish = function(final)
          vim.notify('final: ' .. final)
        end,
        on_partial = function(partial)
          vim.notify(partial)
        end,
        on_error = function(msg)
          vim.notify('error: ' .. msg)
        end
      }
    },
  }
})
```

Configuration

You can move prompts into their own file and use util.module.autoload to quickly iterate on prompt development.

Setup

#### `config = function()`

```lua
local openai = require('model.providers.openai')

-- configure default model params here for the provider
openai.initialize({
  model = 'gpt-3.5-turbo-0301',
  max_tokens = 400,
  temperature = 0.2,
})

local util = require('model.util')

require('model').setup({
  hl_group = 'Substitute',
  prompts = util.module.autoload('prompt_library'),
  default_prompt = {
    provider = openai,
    builder = function(input)
      return {
        temperature = 0.3,
        max_tokens = 120,
        messages = {
          {
            role = 'system',
            content = 'You are helpful assistant.',
          },
          {
            role = 'user',
            content = input,
          }
        }
      }
    end
  }
})
```
Prompt library

#### `lua/prompt_library.lua`

```lua
local openai = require('model.providers.openai')
local segment = require('model.util.segment')

return {
  code = {
    provider = openai,
    builder = function(input)
      return {
        messages = {
          {
            role = 'system',
            content = 'You are a 10x super elite programmer. Continue only with code. Do not write tests, examples, or output of code unless explicitly asked for.',
          },
          {
            role = 'user',
            content = input,
          }
        }
      }
    end,
  },
  ['to spanish'] = {
    provider = openai,
    hl_group = 'SpecialComment',
    builder = function(input)
      return {
        messages = {
          {
            role = 'system',
            content = 'Translate to Spanish',
          },
          {
            role = 'user',
            content = input,
          }
        }
      }
    end,
    mode = segment.mode.REPLACE
  },
  ['to javascript'] = {
    provider = openai,
    builder = function(input, ctx)
      return {
        messages = {
          {
            role = 'system',
            content = 'Convert the code to javascript'
          },
          {
            role = 'user',
            content = input
          }
        }
      }
    end,
  },
  ['to rap'] = {
    provider = openai,
    hl_group = 'Title',
    builder = function(input)
      return {
        messages = {
          {
            role = 'system',
            content = "Explain the code in 90's era rap lyrics"
          },
          {
            role = 'user',
            content = input
          }
        }
      }
    end,
  }
}
```

Contributing

New starter prompts, providers and bug fixes are welcome! If you've figured out some useful prompts and want to share, check out the discussions.

Roadmap

I'm hoping to eventually add the following features - I'd appreciate help with any of these.

Local retrieval augmented generation

The basics are here - a simple json vectorstore based on the git repo, querying, cosine similarity comparison. It just needs a couple more features to improve the DX of using from prompts.

Enhanced context

Make treesitter and LSP info available in prompt context.