karthink / gptel

A simple LLM client for Emacs
GNU General Public License v3.0

(Anthropic) Directive not being sent #276

Closed (jwr closed 3 months ago)

jwr commented 3 months ago

I think I found out why my directives weren't having much effect with Anthropic: it seems they are not in fact appended to the system message. I've spent an hour looking at the code and couldn't find where the directive gets appended. The "Dry run" options ("I"/"J") do not show what actually gets sent, but append the directive to the system message themselves. And the Anthropic code doesn't seem to do anything with the directive.

If I may suggest: the directive needs to be prepended to the first user message with Anthropic, or we should have an option. Appending to the system message is not a good general solution.

This is related to #249

karthink commented 3 months ago

Fixed, thank you.

> I've spent an hour looking at the code and couldn't find where the directive gets appended.

Sorry for the frustration. The Anthropic API is different enough from OpenAI's that I made an error when unifying the interfaces. For future reference, it's better to run (setq gptel-log-level 'info) and look at the *gptel-log* buffer, since that's a record of exactly what was sent and received. The dry-run options (I/J) involve a bit of simulation.
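A minimal sketch of that logging setup, assuming gptel is already loaded. Only gptel-log-level and the *gptel-log* buffer name come from the comment above; the rest is standard Emacs.

```elisp
;; Record what gptel sends and receives.
(setq gptel-log-level 'info)

;; After sending a request, open the log buffer to inspect the exact payload.
(pop-to-buffer "*gptel-log*")
```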

karthink commented 3 months ago

> If I may suggest: the directive needs to be prepended to the first user message with Anthropic, or we should have an option. Appending to the system message is not a good general solution.

Why do you think they should be prepended?

jwr commented 3 months ago

> Sorry for the frustration

Don't be sorry, I'm happy to have helped solve a real issue 🙂 The outcome is good for everyone!

> > If I may suggest: the directive needs to be prepended to the first user message with Anthropic, or we should have an option. Appending to the system message is not a good general solution.
>
> Why do you think they should be prepended?

According to Anthropic documentation:

"in general, you can think about system prompts as a space to provide guidance about the overall interaction with Claude, and the user turn as part of the interaction itself, or when you have only a one-off task you want to accomplish"

That's how I work: system prompt defines the context, while the user prompt sends the specific directive and the text to be operated on.

This isn't clear-cut and is definitely up for debate, but I think we should at least have the option.
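To make the two placements concrete, here is a rough sketch of the request bodies in question. It assumes the Anthropic Messages API shape, with a top-level system field and a messages array. The my/ names and example strings are hypothetical, and this is not how gptel builds its requests. Required fields such as the model and token limit are left out for brevity.

```elisp
(require 'json)

;; Hypothetical example strings, not taken from gptel.
(defvar my/system-prompt "You are a careful technical editor."
  "Overall guidance for the whole interaction.")
(defvar my/directive "Rewrite the following text to be more concise."
  "One-off instruction for this request.")
(defvar my/text "Some text to operate on."
  "The text the directive applies to.")

(defun my/payload-directive-in-system ()
  "Return a request plist with the directive appended to the system prompt."
  (list :system (concat my/system-prompt "\n\n" my/directive)
        :messages (vector (list :role "user" :content my/text))))

(defun my/payload-directive-in-user-turn ()
  "Return a request plist with the directive prepended to the first user message."
  (list :system my/system-prompt
        :messages (vector (list :role "user"
                                :content (concat my/directive "\n\n" my/text)))))

;; Encode either variant to JSON to compare what would go over the wire.
(json-encode (my/payload-directive-in-user-turn))
```

Evaluating both functions side by side shows that the only difference is which field carries the directive; the system prompt and the text itself are unchanged.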

karthink commented 3 months ago

> "in general, you can think about system prompts as a space to provide guidance about the overall interaction with Claude, and the user turn as part of the interaction itself, or when you have only a one-off task you want to accomplish"
>
> That's how I work: system prompt defines the context, while the user prompt sends the specific directive and the text to be operated on.

I understand this, but there's no mention of the difference between appending and prepending the additional directive to the system prompt. The additional directive is an idea we made up -- in both cases it is included with the full system prompt (and not with the "user turn").

My apologies, I misread your point entirely.

> If I may suggest: the directive needs to be prepended to the first user message with Anthropic, or we should have an option. Appending to the system message is not a good general solution.

I read this as "needs to be prepended to the system prompt" (as opposed to "appended to the system prompt").