karthink / gptel

A simple LLM client for Emacs
GNU General Public License v3.0

"Last message must have role `user`" when prompt is above GPT-generated response #230

Open EugeneNeuron opened 3 months ago

EugeneNeuron commented 3 months ago

Hi @karthink, thanks for the great package, first of all.

I noticed that it can be tricky to feed GPT a message that contains parts of a previously generated response.

How to reproduce:

  1. OpenAI-compatible model (pplx in my case)
  2. prompt: "write any simple two-line function in python. no comments, no notes, no explanation"
  3. call gptel-send
  4. it writes a function that you copy somewhere. This text has the "gptel: response" text property (see the snippet after this list):

def two_line_function(x):
    return x * x

This simple function takes a number x as input and returns its square.

  5. you optionally modify the copied response. Now you treat it as your own text, while it still carries the gptel text property.
  6. you attach your question above the code:

is this function correct?

def two_line_function(x):
    return y * x

  7. receive the error "Last message must have role `user`", because the function was sent as a mix of "assistant" and "user" messages, and the last message happened to come from "assistant":

{ "model": "codellama-70b-instruct", "messages": [ { "role": "system", "content": "You are a large language model living in Emacs and a helpful assistant. Respond concisely." }, { "role": "user", "content": "is this function correct?" }, { "role": "assistant", "content": "def two_line_function(x):\n return" }, { "role": "user", "content": "y" }, { "role": "assistant", "content": "* x" } ], "stream": true, "temperature": 1.0 }

  8. if you write your prompt after this mix of GPT's and your own work, the error won't happen, but I think it can still affect the quality of the response, because prompts and responses are mixed up:

{ "model": "codellama-70b-instruct", "messages": [ { "role": "system", "content": "You are a large language model living in Emacs and a helpful assistant. Respond concisely." }, { "role": "user", "content": "" }, { "role": "assistant", "content": "def two_line_function(x):\n return" }, { "role": "user", "content": "y" }, { "role": "assistant", "content": "* x" }, { "role": "user", "content": "is this function correct?" } ], "stream": true, "temperature": 1.0 }

Currently I work around it with the "Refactor" option, or by reading the prompt from the minibuffer, which seems to clear this text property. Of course, this is a fictional, simplified example that more or less fits the "Refactor" use case, but the more you use responses as part of your prompts, the easier it is to hit that error. And as I showed in the last point, a jumbled prompt can silently affect the results without you even noticing it.

Off the top of my head, we can:

  1. provide a command that clears the response text property from a region, so copied text is treated as a plain user prompt again;
  2. strip the property automatically whenever text is copied out of a response.
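A minimal sketch of what the first option might look like, using only standard Emacs primitives and assuming the `gptel' property layout shown in the payloads above; this is illustrative only, not gptel's actual API:

;; Turn the region back into plain user text by removing the
;; response marking (property values in the plist are ignored).
(defun my/gptel-clear-response (beg end)
  "Remove the `gptel' response text property between BEG and END."
  (interactive "r")
  (remove-text-properties beg end '(gptel nil)))

With this, one could select a copied response, run M-x my/gptel-clear-response, and the text would be sent as part of the user message.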

karthink commented 3 months ago

@EugeneNeuron Thanks, I'm aware of this issue. This is one of the trade-offs of using text properties to demarcate responses.

Here's another (related) problem I've faced: what should we do when the user moves the cursor into a response region and starts typing? Should the new text be counted as part of the response, or should it split the response in two, with a new user prompt in between? Currently I'm doing the latter, but I can make a valid argument for both approaches.
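To make the trade-off concrete, here is a toy sketch (not gptel's actual parser) of how contiguous runs of the response property could be mapped to chat roles; a user edit in the middle of a response yields the assistant/user/assistant pattern visible in the JSON payloads above:

;; Walk the buffer and map each contiguous run of the `gptel'
;; property to a chat role.
(defun my/buffer-to-messages ()
  "Return the buffer contents as a list of (ROLE . TEXT) conses."
  (let ((pos (point-min)) messages)
    (while (< pos (point-max))
      (let ((end (or (next-single-property-change pos 'gptel)
                     (point-max)))
            (role (if (eq (get-text-property pos 'gptel) 'response)
                      "assistant"
                    "user")))
        (push (cons role (buffer-substring-no-properties pos end)) messages)
        (setq pos end)))
    (nreverse messages)))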

Of the two solutions you propose, I'm in favor of the first one. The second is both invasive (for obvious reasons) and ineffective, because there are many ways to copy text and we can't cover them all.

I have some ideas about how to provide this command (to clear the response text-property), but I'll think about it some more.

BTW there are already some response-specific actions available to you: try activating the transient menu when the cursor is in the middle of an LLM response.
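(For reference: in a default setup the transient menu is available via M-x gptel-menu, so invoking it with point inside a response should surface those actions.)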

EugeneNeuron commented 3 months ago

> This is one of the trade-offs of using text properties to demarcate responses.

Personally, I think the invisibility that text properties give is worth this trade-off. But maybe this behavior should be mentioned somewhere in the FAQ?

> What should we do when the user moves the cursor into a response region and starts typing?

It's also present in the examples above. I tried to think this issue over with my limited knowledge of AI, and, technically, there's no problem with that, because GPTs just complete after you. That shouldn't bias the conversation a lot, especially given that a prompt engineer can prefill the assistant's responses, use stop_sequences, and who knows what else.

> I have some ideas about how to provide this command

Glad to hear that.

karthink commented 3 months ago

> technically, there's no problem with that, because GPTs just complete after you. That shouldn't bias the conversation a lot.

Yes, I've gone back and forth a few times between the two approaches (when inserting user text in the middle of a response) and no one's noticed.

> Especially given that a prompt engineer can prefill the assistant's responses, use stop_sequences, and who knows what else.

Eventually I plan to surface an interface for these controls in gptel.