karthink / gptel

A simple LLM client for Emacs
GNU General Public License v3.0

"Last message must have role `user`" when prompt is above GPT-generated response #230

Open EugeneNeuron opened 3 months ago

EugeneNeuron commented 3 months ago

Hi @karthink, thanks for the great package, first of all.

I noticed that it can be tricky to feed GPT a message that contains parts of a previously generated response.

How to reproduce:

  1. OpenAI-compatible model (pplx in my case)
  2. prompt: "write any simple two-line function in python. no comments, no notes, no explanation"
  3. call gptel-send
  4. it writes a function that you copy somewhere. This text has the "gptel: response" text property (see the snippet after this list):

def two_line_function(x):
    return x * x

This simple function takes a number x as input and returns its square.

  5. you optionally modify the copied response. Now you treat it as your own text, while it still carries the gptel text property.
  6. you attach your question above the code:

is this function correct?

def two_line_function(x):
    return y * x

  7. receive the error "Last message must have role `user`", because the function was sent as a mix of "assistant" and "user" messages, and the last message happened to come from "assistant":

{ "model": "codellama-70b-instruct", "messages": [ { "role": "system", "content": "You are a large language model living in Emacs and a helpful assistant. Respond concisely." }, { "role": "user", "content": "is this function correct?" }, { "role": "assistant", "content": "def two_line_function(x):\n return" }, { "role": "user", "content": "y" }, { "role": "assistant", "content": "* x" } ], "stream": true, "temperature": 1.0 }

  8. if you write your prompt after this mix of GPT's and your own work, the error won't happen, but I think it can still affect the quality of the response, because prompts and responses are mixed up:

{ "model": "codellama-70b-instruct", "messages": [ { "role": "system", "content": "You are a large language model living in Emacs and a helpful assistant. Respond concisely." }, { "role": "user", "content": "" }, { "role": "assistant", "content": "def two_line_function(x):\n return" }, { "role": "user", "content": "y" }, { "role": "assistant", "content": "* x" }, { "role": "user", "content": "is this function correct?" } ], "stream": true, "temperature": 1.0 }

Currently I work around it with the "Refactor" option, or by reading the prompt from the minibuffer, which seems to clear this text property. Of course, this is a fictional, simplified example that more or less fits the "Refactor" use case, but the more you use responses as part of your prompts, the easier it is to hit that error. And as I showed in the last point, a jumbled prompt can silently affect the results without you even noticing it.

Off the top of my head, we can:

  1. provide a command that clears the response text property from a region, so copied text is treated as a plain user prompt again;
  2. strip the property automatically whenever text is copied out of a response.
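A minimal sketch of what the first option might look like, using only standard Emacs primitives and assuming the `gptel' property layout shown in the payloads above; this is illustrative only, not gptel's actual API:

;; Turn the region back into plain user text by removing the
;; response marking (property values in the plist are ignored).
(defun my/gptel-clear-response (beg end)
  "Remove the `gptel' response text property between BEG and END."
  (interactive "r")
  (remove-text-properties beg end '(gptel nil)))

With this, one could select a copied response, run M-x my/gptel-clear-response, and the text would be sent as part of the user message.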

karthink commented 3 months ago

@EugeneNeuron Thanks, I'm aware of this issue. This is one of the trade-offs of using text properties to demarcate responses.

Here's another (related) problem I've faced: what should we do when the user moves the cursor into a response region and starts typing? Should the new text be counted as part of the response, or should it split the response in two, with a new user prompt in between? Currently I'm doing the latter, but I can make a valid argument for both approaches.
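To make the trade-off concrete, here is a toy sketch (not gptel's actual parser) of how contiguous runs of the response property could be mapped to chat roles; a user edit in the middle of a response yields the assistant/user/assistant pattern visible in the JSON payloads above:

;; Walk the buffer and map each contiguous run of the `gptel'
;; property to a chat role.
(defun my/buffer-to-messages ()
  "Return the buffer contents as a list of (ROLE . TEXT) conses."
  (let ((pos (point-min)) messages)
    (while (< pos (point-max))
      (let ((end (or (next-single-property-change pos 'gptel)
                     (point-max)))
            (role (if (eq (get-text-property pos 'gptel) 'response)
                      "assistant"
                    "user")))
        (push (cons role (buffer-substring-no-properties pos end)) messages)
        (setq pos end)))
    (nreverse messages)))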

Of the two solutions you propose, I'm in favor of the first one. The second is both invasive (for obvious reasons) and ineffective, because there are many ways to copy text and we can't cover them all.

I have some ideas about how to provide this command (to clear the response text-property), but I'll think about it some more.

BTW there are already some response-specific actions available to you: try activating the transient menu when the cursor is in the middle of an LLM response.
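(For reference: in a default setup the transient menu is available via M-x gptel-menu, so invoking it with point inside a response should surface those actions.)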

EugeneNeuron commented 3 months ago

> This is one of the trade-offs of using text properties to demarcate responses.

Personally, I think the invisibility that text properties give is worth this trade-off. But maybe this behavior should be mentioned somewhere in the FAQ?

> What should we do when the user moves the cursor into a response region and starts typing?

It's also present in the examples above. I tried to think this issue over with my limited knowledge of AI, and, technically, there's no problem with that, because GPTs just complete after you. That shouldn't bias the conversation a lot, especially given that a prompt engineer can prefill the assistant's responses, use stop_sequences, and who knows what else.

> I have some ideas about how to provide this command

Glad to hear that.

karthink commented 3 months ago

> technically, there's no problem with that, because GPTs just complete after you. That shouldn't bias the conversation a lot.

Yes, I've gone back and forth a few times between the two approaches (when inserting user text in the middle of a response) and no one's noticed.

> Especially given that a prompt engineer can prefill the assistant's responses, use stop_sequences, and who knows what else.

Eventually I plan to surface an interface for these controls in gptel.