ahyatt / llm

A package abstracting llm capabilities for emacs.
GNU General Public License v3.0

Context does not get sent if list of interactions ≥2 #43

Closed hraban closed 3 months ago

hraban commented 3 months ago

I've added some debug statements, and playing around with the basics I see:

(let ((p (make-llm-chat-prompt
          :context "Context")))
  (llm-chat my-provider p))
raw data sent to https://api.openai.com/v1/chat/completions: {"model":"gpt-4-turbo-preview","messages":[{"role":"system","content":"Context\n"}]}
(let ((p (make-llm-chat-prompt
          :context "Context"
          :interactions (list
                         (make-llm-chat-prompt-interaction :role 'user :content "one")))))
  (llm-chat my-provider p))
raw data sent to https://api.openai.com/v1/chat/completions: {"model":"gpt-4-turbo-preview","messages":[{"role":"system","content":"Context\n"},{"role":"user","content":"one"}]}
(let ((p (make-llm-chat-prompt
          :context "Context"
          :interactions (list
                         (make-llm-chat-prompt-interaction :role 'user :content "one")
                         (make-llm-chat-prompt-interaction :role 'assistant :content "two")
                         (make-llm-chat-prompt-interaction :role 'user :content "three")))))
  (llm-chat my-provider p))
raw data sent to https://api.openai.com/v1/chat/completions: {"model":"gpt-4-turbo-preview","messages":[{"role":"user","content":"one"},{"role":"assistant","content":"two"},{"role":"user","content":"three"}]}

Is this on purpose? It seems rather counter-intuitive, but maybe I'm misunderstanding the point of context.

ahyatt commented 3 months ago

This was on purpose, but I think the fact that it happens is a mistake: the thinking was that the context was really only needed for the first interaction, but I don't think that's necessarily the case. I'll fix this.

hraban commented 3 months ago

Thanks 👍 The way I use context is to set the tone for the entire conversation, so I was expecting it to be available throughout the whole conversation: basically my own version of the prompt prefix that usually gets injected by an LLM chat UI. (I'm sure OpenAI still injects their own prefix somehow regardless of whether I supply a context or not, but...)

If that's not the point of contexts, then sorry for the misunderstanding.

ahyatt commented 3 months ago

I looked at the code, and I think it may actually be doing the right thing. We should store the context in the system prompt, and we don't want to apply it twice because it's already there. For example, here's something I just tried in my scratch buffer.

(setq p
      (make-llm-chat-prompt
       :context "Please respond as if you are a standard Bond villain from the films of the 1970s."
       :interactions (list (make-llm-chat-prompt-interaction :role 'user :content "Explain how Docker authentication works, and why it is needed."))))
#s(llm-chat-prompt "Please respond as if you are a standard Bond villain from the films of the 1970s." nil (#s(llm-chat-prompt-interaction user "Explain how Docker authentication works, and why it is needed." nil)) nil nil nil)

(llm-chat ash/llm-openai p)
"Ah, my dear adversary, you've stumbled into my lair, seeking the secrets of Docker authentication? How utterly predictable. You see, in the grand scheme of my operations, knowledge such as this is but a small cog in a much larger machine, designed to ensure that every container in my dominion operates under my complete control, secure from the meddling of do-gooders like yourself.

Let's commence with the basics—why is Docker authentication needed? In this ever-expanding digital world, where information is as precious as diamonds, Docker authentication serves as the vault door to one's container registry. It ensures that only those who possess the correct keys—be it credentials or tokens—can access and manipulate these container images. Without such a mechanism, my valuable assets could be exposed to anyone with nefarious intentions, much like leaving the blueprints to my underground lair unattended in a café.

Now, onto the mechanism itself. Docker authentication, much like the sophisticated security systems guarding my estates, operates on principles designed to verify the identity of users attempting to access the Docker registry. Initially, when a user or a service attempts to access a Docker registry, they must present credentials. These can be in the form of a username and password, but for automated processes, tokens or certificates are more commonly employed to avoid compromising security if, say, a rogue agent were to intercept them.

Upon receiving these credentials, the authentication system, standing sentinel over the Docker registry, verifies them with the help of an authentication server. Should the credentials be validated, the system grants access, much like how I would allow trusted lieutenants into my command center. Incorrect credentials, on the other hand, result in denial, thwarting any unauthorized attempt to meddle with my container images.

Furthermore, Docker supports various authentication mechanisms, including integration with external providers through protocols like OAuth2, allowing for a flexible and secure way to manage access that even someone of your... capabilities can appreciate.

In essence, Docker authentication is imperative for the same reason I surround myself with loyal henchmen—it ensures that only those whom I have deemed worthy can access my precious resources, protecting my operations from interference by external parties with less noble intentions.

So, my unwitting visitor, while you came seeking knowledge, remember that in the art of cybersecurity and villainy alike, knowledge is power, and securing access is paramount. Now, do enjoy your stay in my lair—though I'm afraid leaving might not be as simple as entering."

p
#s(llm-chat-prompt "Please respond as if you are a standard Bond villain from the films of the 1970s." nil (#s(llm-chat-prompt-interaction system "Please respond as if you are a standard Bond villain from the films of the 1970s.
" nil) #s(llm-chat-prompt-interaction user "Explain how Docker authentication works, and why it is needed." nil) #s(llm-chat-prompt-interaction assistant "Ah, my dear adversary, you've stumbled into my lair, seeking the secrets of Docker authentication? How utterly predictable. You see, in the grand scheme of my operations, knowledge such as this is but a small cog in a much larger machine, designed to ensure that every container in my dominion operates under my complete control, secure from the meddling of do-gooders like yourself.

Let's commence with the basics—why is Docker authentication needed? In this ever-expanding digital world, where information is as precious as diamonds, Docker authentication serves as the vault door to one's container registry. It ensures that only those who possess the correct keys—be it credentials or tokens—can access and manipulate these container images. Without such a mechanism, my valuable assets could be exposed to anyone with nefarious intentions, much like leaving the blueprints to my underground lair unattended in a café.

Now, onto the mechanism itself. Docker authentication, much like the sophisticated security systems guarding my estates, operates on principles designed to verify the identity of users attempting to access the Docker registry. Initially, when a user or a service attempts to access a Docker registry, they must present credentials. These can be in the form of a username and password, but for automated processes, tokens or certificates are more commonly employed to avoid compromising security if, say, a rogue agent were to intercept them.

Upon receiving these credentials, the authentication system, standing sentinel over the Docker registry, verifies them with the help of an authentication server. Should the credentials be validated, the system grants access, much like how I would allow trusted lieutenants into my command center. Incorrect credentials, on the other hand, result in denial, thwarting any unauthorized attempt to meddle with my container images.

Furthermore, Docker supports various authentication mechanisms, including integration with external providers through protocols like OAuth2, allowing for a flexible and secure way to manage access that even someone of your... capabilities can appreciate.

In essence, Docker authentication is imperative for the same reason I surround myself with loyal henchmen—it ensures that only those whom I have deemed worthy can access my precious resources, protecting my operations from interference by external parties with less noble intentions.

So, my unwitting visitor, while you came seeking knowledge, remember that in the art of cybersecurity and villainy alike, knowledge is power, and securing access is paramount. Now, do enjoy your stay in my lair—though I'm afraid leaving might not be as simple as entering." nil)) nil nil nil)

You can see that the context instructions are still there, and will be sent back with the prompt the next time you call the LLM.
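
To continue the conversation, you simply re-use the same prompt object. A minimal sketch, using llm-chat-prompt-append-response to add the next user message (the follow-up question is just illustrative):

;; Append the next user turn to the same prompt and call the provider again.
(llm-chat-prompt-append-response p "Now summarize that in one sentence.")
(llm-chat ash/llm-openai p)

The next request will again begin with the system message carrying the context, followed by the whole exchange so far.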

That said, there may still be something wrong, but at least it isn't happening all the time.

hraban commented 3 months ago

Right, I see: maybe if you keep the original prompt around and add items to it, the context doesn't get cleared. I'm trying to reconstruct the entire prompt statelessly from history on every interaction by passing the entire history to a new make-llm-chat-prompt, and at that point I think it loses the context when the list of interactions is ≥2 items long.

ahyatt commented 3 months ago

It's important that you keep the prompt around and re-use it. The README stresses this (but let me know if it isn't clear enough). Although this is no longer the case, Ollama once had a weird way of keeping context around that you could never re-create, so the prompt had to be kept around. I think it's most generalizable and future-proof to re-use the prompt and not attempt to re-create it after the first interaction.
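
As a concrete illustration, here is a hedged sketch of that pattern (my-provider is from your earlier examples; the variable and function names are hypothetical):

;; Keep a single prompt object for the whole conversation and append to it.
(defvar my/llm-conversation-prompt nil
  "Prompt object re-used across the whole conversation.")

(defun my/llm-ask (text)
  "Send TEXT as the next user message in the running conversation."
  (if my/llm-conversation-prompt
      (llm-chat-prompt-append-response my/llm-conversation-prompt text)
    (setq my/llm-conversation-prompt
          (make-llm-chat-prompt
           :context "Context"
           :interactions (list (make-llm-chat-prompt-interaction
                                :role 'user :content text)))))
  (llm-chat my-provider my/llm-conversation-prompt))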

hraban commented 3 months ago

I read that part, but I don't use Ollama at the moment :D I'm just using OpenAI. Keeping stateful content is too hard for my purposes, and being stateless is too appealing. The benefit of making an LLM I don't use easier to plug in doesn't outweigh the price of rearchitecting my setup and losing statelessness.

ahyatt commented 3 months ago

In that case, I'd advise you to put all the context at the beginning of the first user interaction. That's the safest way and will be compatible with all providers. If this is just for your own purposes and you only want to use OpenAI, then it should be safe to have the first interaction be a system interaction, the second be the user's first interaction, and then any results. If you intend to publish your code as a package, I'd advise figuring out some way to have it be stateful; otherwise your users may eventually have compatibility issues.
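
For illustration, a minimal sketch of the first (provider-agnostic) option, folding the context into the first user message of a statelessly rebuilt prompt (re-using the strings and my-provider from your earlier examples):

;; The context string is simply prepended to the first user message.
(let ((p (make-llm-chat-prompt
          :interactions (list
                         (make-llm-chat-prompt-interaction
                          :role 'user :content "Context\n\none")
                         (make-llm-chat-prompt-interaction :role 'assistant :content "two")
                         (make-llm-chat-prompt-interaction :role 'user :content "three")))))
  (llm-chat my-provider p))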

ahyatt commented 3 months ago

After this discussion, I think there isn't an actual bug with the code. I think that even without keeping the prompt around statefully, @hraban can do something reasonable that should work with good quality. I'm going to close this, but feel free to stick around and ask questions.

hraban commented 3 months ago

I think my main question is what the actual API of this package is: if, for example, supplying a context together with more than one interaction is disallowed, is that worth documenting or raising an error for? And why does the prompt constructor accept a list of interactions in the first place?

ahyatt commented 3 months ago

The best documentation is the README, which does mention all of this. But yes, this can actually be documented better and can have better errors; that's a great point. I'll re-open this bug with that as the targeted fix.

hraban commented 3 months ago

Right, if it helps: I spent a lot of time in C-h P llm RET, and my main point of confusion was that I thought the context would always apply, regardless of other parameters. I also knew you're not supposed to edit the existing interaction history on a prompt, but I didn't see anything about passing multiple interactions to a new prompt constructor. I'm assuming both of those assumptions were wrong, right?

ahyatt commented 3 months ago

Yeah, it isn't clear. Indeed, you shouldn't pass more than a single interaction to a new prompt constructor. There's a reason it's like this: it's easiest to just let people make the prompt struct instance in the normal way you make struct instances in elisp, because that's possible to do and hard to stop clients from doing, and it isn't trivial to create a better way without repeating a lot. But I'll think about it; perhaps there's a better way.

hraban commented 3 months ago

Understood, thank you. I've added an escape hatch for my purposes and prefix the conversation with a system message instead, and it seems to work:

(let ((p (make-llm-chat-prompt
          :interactions (list
                         (make-llm-chat-prompt-interaction :role 'system :content "Always answer in French")
                         (make-llm-chat-prompt-interaction :role 'user :content "one")
                         (make-llm-chat-prompt-interaction :role 'assistant :content "two")
                         (make-llm-chat-prompt-interaction :role 'user :content "three")))))
  (llm-chat my-provider p))

;; => "quatre"

ahyatt commented 3 months ago

@hraban PTAL at my latest change. It isn't released yet, but I think it should be sufficient for your needs, and it provides a clearer and easier-to-use API for prompt creation.

ahyatt commented 2 months ago

@s-kostyaev as an important client of this library, before I release this change, do you have any opinions on the new method of creating prompts (llm-make-chat-prompt)?
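
For reference, a hedged sketch of what a call might look like (the exact signature is assumed here: the conversation content as a single string or a list of alternating user/assistant strings, plus keyword arguments such as :context):

;; Sketch only; argument shapes are assumptions, not the released API.
(llm-chat my-provider
          (llm-make-chat-prompt '("one" "two" "three")
                                :context "Context"))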

s-kostyaev commented 2 months ago

I will look into it, thank you. First impression: LGTM. But the llm-make-chat-prompt documentation lists more keys than the function has arguments; maybe I'm missing something?

ahyatt commented 2 months ago

I will look into it, thank you. First impression: LGTM. But the llm-make-chat-prompt documentation lists more keys than the function has arguments; maybe I'm missing something?

Thanks for noting that; I had one obsolete key still in there, which I've removed.

s-kostyaev commented 2 months ago

@ahyatt can we also add a num_ctx parameter, or even the ability to set custom keys in the options field (https://github.com/ollama/ollama/blob/main/docs/api.md#generate-a-chat-completion)? I'm not sure whether it should be part of llm-make-chat-prompt, make-llm-ollama, or both. I think both.

Also, why do you have a 0.14.1 release while GNU ELPA only has 0.12.3?

ahyatt commented 2 months ago

@s-kostyaev I'm not sure what num_ctx does; the API docs don't seem to explain it. The issue with adding custom keys is that most of those things are part of the prompt, and therefore should be usable with different models, so we just want to support the common parameters. Things that aren't part of the prompt we can support.

Also, thanks for noting the discrepancy with the llm version. I'm not sure what's going on there. Sometimes there might be an issue building, but I haven't received any mail about it. I'll reach out and ask.

s-kostyaev commented 2 months ago

@s-kostyaev I'm not sure what num_ctx does; the API docs don't seem to explain it. The issue with adding custom keys is that most of those things are part of the prompt, and therefore should be usable with different models, so we just want to support the common parameters. Things that aren't part of the prompt we can support.

options: additional model parameters listed in the documentation for the Modelfile (https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values) such as temperature

num_ctx | Sets the size of the context window used to generate the next token. (Default: 2048) | int | num_ctx 4096

You can set a different num_ctx (context window size) for the same model. For example, Mistral supports a 32k context, but the default value is 2k. A long context is good in many cases, but it uses more RAM/VRAM.
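
For illustration only (not from this thread), a chat request with that option set might look like:

{"model":"mistral","messages":[{"role":"user","content":"one"}],"options":{"num_ctx":4096}}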

ahyatt commented 2 months ago

Thanks, that makes sense. I'll look into ways to support this in the next release after the current one I'm working on.

ahyatt commented 1 month ago

@s-kostyaev I've added a solution for this, PTAL and let me know if you think it is sufficient.

s-kostyaev commented 1 month ago

@ahyatt LGTM, thank you. Can we also add temperature and non-standard options to the provider constructor? If set in the constructor, they would be the default values for chat interactions with that provider and could be overridden, with non-standard options overridden on a key-by-key basis. That would be useful; for example, I would set temperature 0 for a RAG provider.
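
A sketch of what I have in mind (the keyword names here are hypothetical, just to show the shape):

;; Hypothetical keywords; the idea is provider-level defaults that
;; individual prompts can still override.
(setq my-rag-provider
      (make-llm-ollama :chat-model "mistral"
                       :default-chat-temperature 0.0
                       :default-chat-non-standard-params '(("num_ctx" . 4096))))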

ahyatt commented 1 month ago

@s-kostyaev it's a good idea, thank you! I've added it now, PTAL and let me know if that's what you expected.

s-kostyaev commented 1 month ago

Thank you! This is exactly what I want.

s-kostyaev commented 1 month ago

@ahyatt When will it be released?

ahyatt commented 1 month ago

I think in a few days. I'd like to just take another look at the change and especially the documentation, and then run the tests.