gptel without conversation feature?

karthink commented 2 months ago

I thought perhaps more screenshots will help me explain what I mean. Please take a look at these three screenshots. On the left is what I see, it's all the information I have and I'm trying to guess what gptel will do. On the right you can see three different behaviors, each one will give me very different results. But I have no way of knowing what will happen just by looking at my text buffer and gptel options.

What I really expect to happen is in the first screenshot. System prompt, then the contents of my buffer up to the point, no magic processing behind the scenes, no additional hidden context.

As to the Ollama API, perhaps I misunderstand, but I don't see a problem: you may (but do not have to) pass the "context" parameter. If you don't, only what you send is used. That is exactly what I'm trying to achieve. I do not expect a conversational chat interface in my text buffer, I am looking for a predictable tool for manipulating my text.

I hope this helps clarify what I'm trying to do 🙏

Originally posted by @jwr in https://github.com/karthink/gptel/issues/249#issuecomment-2067607603

karthink commented 2 months ago

> you may (but do not have to) pass the "context" parameter. If you don't, only what you send is used. That is exactly what I'm trying to achieve. I do not expect a conversational chat interface in my text buffer, I am looking for a predictable tool for manipulating my text.

Let's put aside issues with Ollama for a bit. I can explain

- what gptel-send does (or is supposed to do),
- why it does what it does, and
- how to use gptel as a "predictable tool for manipulating your text".

What `gptel-send` does

It scans the buffer up to point (limited to the region if active), and creates a conversation record. That is, it classifies all the text into "user" and "assistant" categories.

Then it builds a JSON query containing this record, interleaving the "user" and "assistant" parts and sends it to the LLM provider.

Note that for "oneshot" uses, or if the buffer does not contain any past responses from the assistant, then there is no "conversation". So what's sent is just one user prompt.
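To make the classification step concrete, here is a minimal sketch of how buffer text could be split into "user" and "assistant" parts based on a text property. This is not gptel's actual code: the function name `my/sketch-conversation-record` and the property name `my-response` are hypothetical, chosen only for illustration.

```elisp
(require 'subr-x) ; for `string-trim' and `string-empty-p'

(defun my/sketch-conversation-record (&optional prop)
  "Return the buffer text up to point as a list of (ROLE . TEXT) pairs.
Text carrying the text property PROP counts as `assistant', all other
text as `user'.  PROP defaults to `my-response', a hypothetical name
used only in this sketch."
  (let ((prop (or prop 'my-response))
        (end (point))
        (pos (point-min))
        record)
    (while (< pos end)
      ;; Find where the property value next changes, but never go past point.
      (let* ((next (next-single-property-change pos prop nil end))
             (role (if (get-text-property pos prop) 'assistant 'user))
             (text (string-trim (buffer-substring-no-properties pos next))))
        ;; Skip chunks that are only whitespace.
        (unless (string-empty-p text)
          (push (cons role text) record))
        (setq pos next)))
    (nreverse record)))
```

If no text in the buffer carries the property, the result is a single `user` entry, which corresponds to the "oneshot" case described above.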

Why `gptel-send` does this

First, gptel is primarily an LLM chat client, with the intent to provide a conversational chat interface in any text buffer. This is by design, so it's exactly what you don't want.

But gptel-send covers the "oneshot" case just fine, where you just want the answer to one question and then move on. So the same command can be used for a oneshot interaction, that can then freely turn into a multi-turn conversation. Many of the redirections available from the transient menu are effectively using gptel-send (more or less) for oneshot interactions.

Again, all of this ignores Ollama, whose API does not lend itself well to this. (This is a separate issue that needs fixing.)

If you don't want multi-turn conversations

My understanding is that you want to send all the text in a buffer without treating it as a conversation. As you may have surmised from the above, gptel-send is not what you should be using for this.

What you want is covered by the lower-level gptel-request function that gptel provides for exactly this kind of thing: you explicitly specify what you want to send, and use a callback to handle the result. All network requests in gptel (including gptel-send) are sent using gptel-request.

My suggestion is to write yourself a simple command using gptel-request that does what you want:

(defun gptel-oneshot ()
  "Send the buffer text from the beginning up to point as a single prompt."
  (interactive)
  (gptel-request (buffer-substring-no-properties
                  (point-min) (point))
    :stream t))

This command uses gptel-request to send the contents of the buffer up to point, unconditionally, without regard for the "conversation" structure that gptel-send creates.

You may want to be able to limit to the active region:

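(defun gptel-oneshot ()
  "Send the active region, or else the buffer text up to point, as one prompt."
  (interactive)
  ;; Use the region bounds directly: point may sit at the start of the
  ;; region, in which case "start of region up to point" would be empty.
  (let ((prompt (if (use-region-p)
                    (buffer-substring-no-properties
                     (region-beginning) (region-end))
                  (buffer-substring-no-properties
                   (point-min) (point)))))
    (gptel-request prompt :stream t)))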

You can tweak the command further to do what you want. Take a look at the documentation of gptel-request -- it gives you fine control over all parameters of the request.

gptel-oneshot (or equivalent) should do exactly what you want, even with Ollama's API.
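As one possible tweak, here is a sketch that passes an explicit system message and handles the result in a callback instead of inserting it at point. It assumes gptel is installed and relies on the `:system` and `:callback` keywords described in the gptel-request docstring. The command name, the `*gptel-oneshot*` buffer name, and the system message text are made up for this example.

```elisp
(require 'gptel)

(defun my/gptel-oneshot-to-buffer ()
  "Send the buffer text up to point and show the reply in a separate buffer.
A sketch only: the names and the system message are illustrative."
  (interactive)
  (gptel-request
      (buffer-substring-no-properties (point-min) (point))
    ;; Illustrative system message; replace with your own directive.
    :system "You are a careful copy editor."
    ;; No :stream here, so the callback is expected to run once with
    ;; the complete response.
    :callback
    (lambda (response info)
      (if (stringp response)
          (with-current-buffer (get-buffer-create "*gptel-oneshot*")
            (erase-buffer)
            (insert response)
            (display-buffer (current-buffer)))
        ;; On failure RESPONSE is not a string; INFO carries the status.
        (message "gptel-oneshot failed: %s" (plist-get info :status))))))
```

Because the prompt and the system message are both given explicitly, nothing in the buffer is parsed as a conversation.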

jwr commented 2 months ago

> My understanding is that you want to send all the text in a buffer without treating it as a conversation.

Yes, that's very close to what I am trying to do (I don't always want all the text, sometimes just the region, but that's very close). In fact, it was so obvious to me that I assumed that that's what gptel does, hence our misunderstanding. I am still not convinced at all of the value of a conversational interface (e.g. marking some text as being a model response, with that marking being invisible) in text buffers. I often edit responses from models and send them back in again. But perhaps that's not what you intended for gptel.

I had thought about writing a different interface before. But I stopped when I realized two issues:

- I do like the transient interface, especially after it became more predictable by options not being buffer-local (it still loses my directive, but this can be improved). I often switch between models and options, and the interface is great for that. It's the main reason why I'm using gptel.
- gptel-request does a lot of work and there is no clear place where I could plug in.
  - I had trouble following the code: I couldn't even find the place where the directive gets appended to the system prompt, when I wanted to prepend it to the first user message instead.

What I think I'd much rather see is a "no context" option (effective for non-gptel buffers only). If enabled, it would skip the whole buffer-parsing logic and send either the region, or the buffer contents up to point. It's something I would always keep enabled. Would that be an acceptable solution for you?

I would offer to implement this and do a PR, but the changes to gptel-request would be significant.

gregoryg commented 2 months ago

@jwr if you are interested in non-chat usage without coding, I just happened to put up a video yesterday showing a couple of those use cases https://www.youtube.com/watch?v=yAL0cw1ePqw&t=26s ... @karthink gets into examples that seem more like your usage in his video as well https://www.youtube.com/watch?v=bsRnh_brggM

karthink commented 3 weeks ago

@jwr I forgot to mention it here -- I added the option to disable response tracking in non-chat buffers a couple of weeks ago. You can set gptel-track-response for this. Could you set this variable and let me know if it makes gptel work how you expect it to?

jwr commented 2 weeks ago

I have been using that feature since it first appeared in the code (I didn't actually know about the variable, I set it in the menu). It makes gptel much more useful for me: I get predictable results and I'm able to work with text comfortably. I still have problems with the volatility/changeability of gptel's directive, but this is a huge improvement. Thank you! 🙏

karthink commented 2 weeks ago

Yeah, you can set gptel-track-response to nil in your configuration once and forget about it, you don't have to set it each time from the menu.
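For example, something like the following in an init file should do it. This is a sketch: `setq-default` is used here on the assumption that the variable may be buffer-local in some buffers, and `with-eval-after-load` simply delays the setting until gptel is loaded.

```elisp
;; Do not treat earlier LLM responses as separate "assistant" turns
;; in non-chat buffers.
(with-eval-after-load 'gptel
  (setq-default gptel-track-response nil))
```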

> I still have problems with the volatility/changeability of gptel's directive

This is easy to change as well, just press C-x s in the transient menu to save all your settings. Then the directive will be applied whenever you open the menu.

I've mentioned more permanent solutions for persistent transient options (including the directive) on the wiki and in the other issue.

jwr commented 2 weeks ago

Yes, I realize that I can use C-x s to save the options, but I still find it jarring that gptel forgets some things if I don't do that. It is especially visible if I use I or J to inspect the generated query — I check the query, decide that it's OK, press q to get rid of the query buffer, and the next time I bring up gptel, the setup is different.

I can see that some options are kept between invocations, but some are not. For example, changing the model or temperature "sticks", but output options or the directive do not and get reset every time. This is the volatility/changeability that I mentioned.

I think what I mentally expect is that if I bring up the gptel menu, change some options, and bring it up again, everything will be exactly the same as the last time I saw the menu. In other words, I expect to be able to bring it up and quit it as I wish without worrying and checking what changed since the last time I saw the menu.

karthink commented 2 weeks ago

Closing since this feature has been implemented: customize gptel-track-response or use gptel's transient menu.
