jordantgh closed this issue 1 year ago
Thanks for reporting this; I'll look into it soon, as it's definitely not intended.
Also, the newer title method, which uses langchain, limits the entire title prompt to under 100 characters regardless of the size of the prompt, because it only uses snippets; but I can also update the older method, which is the fallback, to behave the same way. It may be that the fallback method is being used here, perhaps a symptom of OpenRouter or the particular model you selected. I also use OpenRouter, so I can test with your exact configuration.
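For illustration, a titling prompt built from snippets might look something like the sketch below. This is only a rough illustration of the idea, not the actual implementation; the function name and the way the character budget is split are assumptions.

```js
// Illustrative sketch: build a short titling prompt from truncated snippets
// so the title request stays cheap regardless of conversation size.
// buildTitlePrompt and the 50/50 budget split are assumptions, not the
// real implementation.
function buildTitlePrompt(userMessage, aiResponse, budget = 100) {
  const half = Math.floor(budget / 2);
  const snippet = (text) =>
    text.length > half ? text.slice(0, half) + '...' : text;
  return (
    'Write a concise title for this conversation:\n' +
    `User: ${snippet(userMessage)}\nAI: ${snippet(aiResponse)}`
  );
}
```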
No worries. My PR is there if you want to take a look. It is quite a simplistic fix. It's my first ever PR in a repo other than my own, so apologies if it isn't up to scratch.
Any PR is a good PR in my book! Even when it's not up to scratch, it's an opportunity to code review and invite conversation. Thanks for doing that. I plan to approve it and do the bit of work to minimize token usage further for titling, while also making the model configurable and the whole feature optional.
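Conceptually, the configurable and optional parts could look something like the sketch below; the environment variable names here are hypothetical, not existing settings.

```js
// Hypothetical sketch of a configurable, optional titling feature.
// TITLE_CONVO and TITLE_MODEL are assumed names, not current settings.
const titlingEnabled = process.env.TITLE_CONVO !== 'false'; // opt-out toggle
const titleModel = process.env.TITLE_MODEL || 'gpt-3.5-turbo'; // cheap default
```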
Contact Details
No response
What happened?
I originally noticed the issue when sending prompts via `gpt-4-32k`. I could see in my API usage that I was getting double charged by this API for every new chat I would start. It was clear that the second charge was due to the use of `gpt-4-32k` to choose the title. Now, I begrudge paying $0.5-1 for a single prompt anyway, but doubling it for the chat title is too much. I originally thought this was intended behaviour to deal with very large prompts that don't fit in the `gpt-3.5-turbo` context. (Even so, my preferred behaviour in that case would be to leave the chat untitled if it is too big to fit.) However, I noticed that it will use `gpt-4-32k` for titling even with a small prompt, so I figured something is wrong and it is always using the active chat model.
I chased the bug down to the following source.
`OpenAIClient.titleConvo()` tries to pass `gpt-3.5-turbo-0613` (or `gpt-3.5-turbo`) to `setOptions()` (via `sendPayload()`), but `setOptions()` fails to update the class property appropriately (code here). If `this.modelOptions` already exists (which it will once any message has been sent, I assume), it will not be updated. That means the model used for titling cannot differ from the model used for chatting, which seems suboptimal. To fix this I will be submitting a PR with a small change that adds an else clause to simply update the model. This is a bit of a hack and doesn't fix the wider issue of letting the user decide what model to use; for very long `gpt-4-32k` prompts there will still be an error and the chat will remain untitled. But it will prevent people getting a fright from much larger than expected API costs, which seems most pressing.
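To make the failure mode concrete, here is a simplified sketch of the pattern and the proposed else clause. It is illustrative only; the option shapes are assumptions and this is not the verbatim source.

```js
// Simplified illustration of the bug and fix; not the actual LibreChat code.
class OpenAIClient {
  setOptions(options) {
    if (!this.modelOptions) {
      // Current behaviour: modelOptions is only populated on the first call,
      // so the gpt-3.5-turbo options later passed in by titleConvo() are
      // ignored once a chat model (e.g. gpt-4-32k) has been set.
      this.modelOptions = { ...options.modelOptions };
    } else {
      // Proposed fix: merge the incoming options so a later call can
      // actually switch the model used for titling.
      this.modelOptions = { ...this.modelOptions, ...options.modelOptions };
    }
  }
}
```

With the merge in place, a titling call passing `{ model: 'gpt-3.5-turbo-0613' }` takes effect instead of silently reusing the active chat model.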
Steps to Reproduce
Start a new chat using any model outside the `gpt-3.5-turbo-*` family.
What browsers are you seeing the problem on?
Chrome
Relevant log output
No response
Screenshots
Code of Conduct