David-Kunz / gen.nvim

Neovim plugin to generate text using LLMs with customizable prompts
The Unlicense

Allow for ollama REST API usage #30

Closed RubenSmn closed 10 months ago

RubenSmn commented 11 months ago

Since I am running ollama on a different machine, I would like to be able to use this plugin through the REST API instead of starting a new service on my working machine.

Maybe this could be done by some configuration?

wishuuu commented 10 months ago

Hi @RubenSmn

Right now, using ollama through the REST API isn't implemented here, but it sounds like a great feature. It also shouldn't be hard to achieve, but there is a small problem with ollama's REST API.

If you send a request to the 'generate' endpoint, you have to wait for the whole result to be generated before seeing anything, as it returns one big object containing the whole result at once, instead of returning token after token like the local ollama version. That may be okay if the remote server is a really strong machine and there aren't many other requests, but in most cases it will mean a long stare at a blank buffer.

Maybe if the ollama team implements an API with a two-way communication protocol like WebSocket, so that it could return result tokens as they are generated, this feature would be worth considering. Right now I feel like it is useful only for a very narrow group of users.

With the features that are already available, you could try setting up a passwordless SSH connection to your remote machine and configuring a command like this:

```lua
require('gen').command = 'ssh user@machine "ollama run $model $prompt"'
```

You may also need to fork this repo and comment out lines 72 to 76 of the gen/init.lua file in your local version, as some users experienced problems with those lines on machines without ollama installed.

I've not tested this myself, but it should work in your case.

Oskar

David-Kunz commented 10 months ago

Thank you, @RubenSmn for the feature request and @wishuuu for the detailed explanation. I think it's important to stream the result, so I agree with you.

I wonder if it's also possible to use

```lua
require('gen').command = 'curl ...'
```

to use the REST API. The only requirement is that the result is piped to stdout.

wishuuu commented 10 months ago

Hi @David-Kunz

Unfortunately, a plain curl that pipes its output to stdout is not a solution either: the result would be unreadable that way. To show what I mean, look at this example I've prepared:

Request:

```sh
curl --location 'http://localhost:11434/api/generate' \
  --header 'Content-Type: application/json' \
  --data '{ "model": "codellama", "prompt": "Generate C# function to calculate nth fibonacci number" }'
```

Response:

```
{ "model": "codellama", "created_at": "2023-11-15T09:19:17.105093041Z", "response": "```", "done": false }
{ "model": "codellama", "created_at": "2023-11-15T09:19:17.29534601Z", "response": "\n", "done": false }
{ "model": "codellama", "created_at": "2023-11-15T09:19:17.447164889Z", "response": "public", "done": false }
{ "model": "codellama", "created_at": "2023-11-15T09:19:17.612943733Z", "response": " static", "done": false }
{ "model": "codellama", "created_at": "2023-11-15T09:19:17.769394156Z", "response": " int", "done": false }
...
```

As you can see, the REST API response has to be parsed, and only the concatenation of the `response` fields of all returned objects should be displayed to the user.
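For illustration, here is a small sketch of that concatenation step (assuming `jq` is installed), run on a hand-written, abridged copy of such a stream rather than against a live server:

```shell
# Abridged, hand-written sample of the streamed objects (not real
# server output); jq -j prints raw values with no newline between them.
printf '%s\n' \
  '{"model":"codellama","response":"public","done":false}' \
  '{"model":"codellama","response":" static","done":false}' \
  '{"model":"codellama","response":" int","done":false}' |
  jq -j '.response'
# prints: public static int
```

In a real setup, the `printf` would be replaced by the `curl` call to the API.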

Oskar

frbor commented 10 months ago

I do not have the complete solution for the setup, since I think you might need to escape the prompt before you put it into the JSON query, but with `"stream": false` and `jq` to extract the `response` part you can use this one-liner:

```sh
curl --silent http://localhost:11434/api/generate \
  --data '{ "model": "mistral:instruct", "prompt": "Generate C# function to calculate nth fibonacci number", "stream": false }' \
  | jq -r .response
```

```
public static int Fibonacci(int n) {
    if (n <= 1) return n;

    int a = 0, b = 1, sum;
    for (int i = 2; i <= n; i++) {
        sum = a + b;
        a = b;
        b = sum;
    }
    return b;
}
```

David-Kunz commented 10 months ago

Exactly, whatever command you use must produce the proper result; `jq` is a perfect fit for that.

However, I noticed that `$prompt` gets replaced by the shell-escaped version, which might cause some problems. It might be that we need to switch to REST-based communication anyway, since Ollama made some changes to their CLI tool. I think I'll merge https://github.com/David-Kunz/gen.nvim/pull/36 once it reaches stability (plus some minor adjustments).
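As a sketch of one way around the escaping issue (assuming `jq` is available; this is not how the plugin currently does it): let `jq` construct the JSON body, since `--arg` takes care of JSON escaping:

```shell
# Sketch: jq --arg JSON-escapes the prompt, so quotes (and newlines)
# in it cannot break the request body.
prompt='Print "hello world" in C#'
body=$(jq -cn --arg model mistral --arg prompt "$prompt" \
  '{model: $model, prompt: $prompt, stream: false}')
echo "$body"
# prints: {"model":"mistral","prompt":"Print \"hello world\" in C#","stream":false}
# The body could then be sent with:
#   curl --silent http://localhost:11434/api/generate --data "$body" | jq -r .response
```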

David-Kunz commented 10 months ago

Now, gen.nvim uses HTTP to communicate with Ollama.