karthink / gptel

A simple LLM client for Emacs
GNU General Public License v3.0
1.04k stars · 113 forks

Restoring Ollama state correctly #181

Closed karthink closed 1 month ago

karthink commented 5 months ago

The state of Ollama chats cannot be restored accurately, since we do not (presently) store the chat embedding vector that the API returns. Storing it is simple, but this vector tends to be large, and saving it will cause two issues in Emacs:

  1. The buffer will roughly double in size when saved to disk.
  2. The vector will be stored as a string on a single logical line, triggering Emacs's long-line performance issues.

The option of storing it separately in a data file (ASCII or binary) is off the table: one of gptel's objectives is to produce self-contained, single-file chats that are portable and reproducible, modulo pinning the Emacs+gptel version. (And we try very hard to stay backward-compatible so that pinning isn't necessary.)
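To make the size concern concrete, here is a rough back-of-the-envelope sketch (variable name and the 6-bytes-per-token figure are illustrative assumptions, not gptel internals). A context vector is a long list of small integer token ids, and printing it into the buffer's saved state adds several bytes per id:

```emacs-lisp
;; Hypothetical estimate: bytes added to a saved chat buffer by printing
;; an Ollama context vector of N-TOKENS token ids.  Each id is a small
;; integer that prints as roughly 5 digits plus a separating space.
(defun my/ollama-context-print-size (n-tokens)
  "Estimate the printed size in bytes of a context vector of N-TOKENS ids."
  (* n-tokens 6))

;; A chat that has accumulated ~8000 tokens of context:
(my/ollama-context-print-size 8000) ;=> 48000
```

So a modest chat could gain tens of kilobytes on a single logical line, which is exactly the long-line scenario Emacs handles poorly.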

luyangliuable commented 4 months ago

Is this why we get the `Ollama error (nil): Malformed JSON in response.` error whenever we receive a response from Ollama?

karthink commented 4 months ago

> Is this why we get the `Ollama error (nil): Malformed JSON in response.` error whenever we receive a response from Ollama?

No, there's no connection between this and the error.

The error sounds like a bug. Is Ollama not working for you at all, or does it not work when you restore a chat from a file on disk?

luyangliuable commented 4 months ago

I get the error when I run `gptel-send` with the following configuration. It also takes about 10 minutes before the response arrives, and I sometimes get `Response Error: nil`.

```emacs-lisp
(setq-default gptel-model "mistral:latest" ; Pick your default model
              gptel-backend (gptel-make-ollama "Ollama"
                              :host "localhost:11434"
                              :stream t
                              :models '("mistral")))
```
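As an aside (an observation about the snippet above, not a diagnosis confirmed in the thread): `gptel-model` is set to `"mistral:latest"` while the backend's `:models` list only registers `"mistral"`, so the two names don't match. A consistent configuration might look like:

```emacs-lisp
;; Hypothetical fix: use the same model name in both places, so the
;; selected model actually appears in the Ollama backend's model list.
(setq-default gptel-model "mistral:latest"
              gptel-backend (gptel-make-ollama "Ollama"
                              :host "localhost:11434"
                              :stream t
                              :models '("mistral:latest")))
```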

* gptel-log:

```json
{ "gptel": "request body", "timestamp": "2024-02-11 11:13:45" }
{ "model": "mistral", "system": "You are a large language model living in Emacs and a helpful assistant. Respond concisely.", "prompt": "Test", "stream": true }
```



The ollama server is also active, and `ollama run mistral` works normally.
![image](https://github.com/karthink/gptel/assets/23611033/5f992ddb-29ae-49f9-8178-197fabdc67f7)
zk395 commented 2 months ago

I wasn't aware of this issue, so I tried loading saved Ollama state, and with the new prompt it basically ignored all the previous context. I guess this is why? Is there any progress on this, or some kind of workaround I could use?

karthink commented 2 months ago

This was fixed recently; please update gptel.
