mckaywrigley / chatbot-ui

AI chat for every model.
https://chatbotui.com
MIT License
28.28k stars 7.85k forks

Bug? Claude 3 responses seem to be cut off after a few words/tokens. #1543

Closed alexisdal closed 5 months ago

alexisdal commented 6 months ago

I wanted to evaluate Claude 3 Opus, Anthropic's latest and biggest model, which just ranked #3 in the Chatbot Arena leaderboard, the closest to GPT-4 Turbo to date.

So I just added some credits in the console, created a new Claude API key, and pasted it into chatbotui to try Claude 3 Opus. I used the default settings, but the first answer was immediately cut off mid-sentence. That reminded me of the default max_length in the OpenAI playground, ridiculously limited to 256 tokens, but I didn't find this parameter in the settings at all. I just increased the context length to 200k tokens, just in case (though likely unrelated), and started a new chat with Claude 3 Opus. Screenshot below.

Yet the answers seem to be cut off / chopped.

=> In other words, I don't know what I'm doing wrong, but this is clearly not the expected behavior of Opus 3.

image

alexisdal commented 6 months ago

For comparison: how Claude 3 Opus behaves in the Anthropic console/playground. I used the very same prompts. Notice the default settings on the left:

- temperature is on a scale of 0 to 1 (and defaults to 0), whereas OpenAI's temperature ranges from 0 to 2 and defaults to 1.
- max tokens to sample = max length of one Claude response = defaults to 1000.

It's visible that Claude 3's answer there is appropriate, correct, and not cut off.

image

ScrogBot commented 6 months ago

I’ve noticed that with Claude as well. I respond with “continue” and it finishes.

rmkr commented 6 months ago

Continue doesn't work for me. My API key does work on TypingMind. I do have ChatbotUI Plus, so I don't think it's related to your account level. There is also a Twitter thread where @mckaywrigley talks about this and attributes it to the type of account (evaluator accounts).

image

alexisdal commented 6 months ago

I'm fine throwing a few $$ at extra credits (instead of the free $5 that were awarded to my account and used in the screenshots above), but I need someone to confirm whether or not it's actually useful so that I don't throw money out the window. => Indeed, I don't understand why the playground/console (which works via the API) would give a different response than chatbotui, which also uses the API... 🤔

rmkr commented 6 months ago

I have ~25 dollars' worth of credit, plus the 5, so I don't think it's that. I am on Build Tier 1, which is one above free. Either they are limiting the tokens coming out, or something is wrong with the query. This is just speculation.

alexisdal commented 6 months ago

Instead, I just tried to execute the same system/user prompt using a bare-minimum Python script (copy/pasted from the Anthropic console).

For me, Anthropic's answer is not truncated, so it smells like something in chatbotui.

(.venv) C:\Users\alex\Desktop\test_claude3>nano .env

(.venv) C:\Users\alex\Desktop\test_claude3>cat run.py
from dotenv import load_dotenv
load_dotenv()

import anthropic

client = anthropic.Anthropic(
    # defaults to os.environ.get("ANTHROPIC_API_KEY")
    # api_key="my_api_key",
)
message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1000,
    temperature=0,
    system="You are a friendly, helpful AI assistant.",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Hi Opus, Is that true that you're really super charged compared to other anthropic AI models? what are the other anthropic models and how those models compare? explain me"
                }
            ]
        }
    ]
)
print(message.content)

(.venv) C:\Users\alex\Desktop\test_claude3>python run.py
[ContentBlock(text='I appreciate your interest, but I\'m actually not sure how I compare to other Anthropic AI models in terms of capabilities. I know I was created by Anthropic, but I don\'t have detailed information about their other models or my exact capabilities relative to them. \n\nI aim to be helpful and to engage in friendly conversation, but I think it\'s best if I avoid making strong claims about being "super charged" or superior to other AIs, since I\'m quite uncertain about that. I\'m an AI with significant capabilities in many areas, but also important limitations.\n\nRather than comparing myself to other AIs, I think it\'s better to focus on how I can be helpful to you in our conversation. Please let me know if there are any topics you\'d like to discuss or ways I can assist you!', type='text')]

(.venv) C:\Users\alex\Desktop\test_claude3>

(.venv) C:\Users\alex\Desktop\test_claude3>pip list
Package            Version
------------------ --------
annotated-types    0.6.0
anthropic          0.19.1
anyio              4.3.0
certifi            2024.2.2
charset-normalizer 3.3.2
colorama           0.4.6
distro             1.9.0
filelock           3.13.1
fsspec             2024.2.0
h11                0.14.0
httpcore           1.0.4
httpx              0.27.0
huggingface-hub    0.21.4
idna               3.6
packaging          23.2
pip                24.0
pydantic           2.6.3
pydantic_core      2.16.3
python-dotenv      1.0.1
PyYAML             6.0.1
requests           2.31.0
sniffio            1.3.1
tokenizers         0.15.2
tqdm               4.66.2
typing_extensions  4.10.0
urllib3            2.2.1

(.venv) C:\Users\alex\Desktop\test_claude3>

For the record, I just created a fresh venv with Python 3.12 and did a pip install anthropic python-dotenv (to store my API key in a separate .env file) => all the other libs are dependencies.

alexisdal commented 6 months ago

This is what I suspect is happening: chatbotui truncates Claude 3's response; upon clicking the send button, the entire context is sent back to Claude, which then thinks it cut its own response short and tries to apologize for it. It didn't actually do that, but that's what it looks like given the dict sent in the API request. The new response is then eventually cut off again by chatbotui => and the loop basically continues.

This makes the claude-3-opus-20240229 model (and maybe other Claude/Anthropic models) totally useless in chatbotui?

Just a hypothesis, but I can't think of another one that better explains the behaviors and facts collected so far.

rmkr commented 6 months ago

They could also be limiting responses to the IP of the ChatbotUI server. This could be verified by hosting ChatbotUI locally and seeing if the response is still cut off.

rmkr commented 6 months ago

I confirmed that if I host locally, I do not have the issue with Claude. It may be Claude rate-limiting the server's IP.

Screenshot 2024-03-09 at 13 11 47

tysonvolte commented 6 months ago

Still debugging this, but the debug console says the pipeline is broken when this occurs for me. I'll try to look into it tomorrow.

tysonvolte commented 6 months ago

Solved it: it's due to Vercel's maxDuration timeout (15 seconds if you check the logs). However, setting the API endpoint to use the Edge runtime fixes the issue.

// export const runtime = "edge"

Might've been an accidental oversight, but uncommenting this line in the Anthropic endpoint fixes the issue.

More information on why it happens:

https://vercel.com/docs/functions/configuring-functions/duration
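Concretely, the fix is a one-line runtime directive at the top of the route file. The path below is an assumption based on Next.js App Router conventions and the `POST /api/chat/anthropic` endpoint seen in the logs later in this thread, not a confirmed path in the repo:

```typescript
// app/api/chat/anthropic/route.ts  (path assumed, not confirmed from source)
// Opting the route into Vercel's Edge runtime lifts the serverless
// maxDuration limit, so long streamed Claude responses are not cut off.
export const runtime = "edge";
```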

lgruen commented 6 months ago

@tysonvolte Thanks for the suggestion! I tried that and have redeployed to Vercel, but it doesn't seem to help in my case.

tysonvolte commented 6 months ago

> @tysonvolte Thanks for the suggestion! I tried that and have redeployed to Vercel, but it doesn't seem to help in my case.

Can you check your Vercel logs and paste what error you're getting?

spammenotinoz commented 6 months ago

Your mileage may vary, but I found the same issue occurred via the Workbench for Sonnet and Haiku; Opus worked fine. After putting $5 on the account, they would all complete queries correctly in the Workbench, but I still have the problem in ChatBot-UI.

Thanks @tysonvolte, however removing the // seemed to break my deployment, i.e. only a single word was returned. image

image

Based on the 10-second timeout, it's possibly running as a serverless function rather than an edge function, which is where @tysonvolte was heading, but it didn't work for me.

image

spammenotinoz commented 6 months ago

> this is what I suspect is happening chatbotui strips off claude3 responses upon clinking on the send button, then entire context is sent back to claude, who then thinks it cut its own responses and then claude tries to apologize for cutting its own answers, which it did not do, but it's what it looks like given the dict sent in the API request, and then the new response it eventually cut off again by chatbotui => and the loop basically continues
>
> making claude-3-opus-20240229 model (and maybe other claude / anthropic models) totally useless in chatbotui ?
>
> just an hypothesis, but I can't think of another one that would better explain the behaviors and facts collected until now.

That's how all API models work, and it's why token counts escalate quickly during a conversation: they compound. If you paste in, say, 4k tokens and it sends 4k back, on your next prompt you send 8k back plus your new prompt... and this goes on. I've put a cap on it to only include history for the last 10 messages, but a better way would be to use a model to summarize/rationalize the previous messages and responses.
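The cap described above can be sketched like this (a minimal sketch; `ChatMessage` and `capHistory` are illustrative names, not chatbot-ui's actual code):

```typescript
// Sketch: cap the chat history sent to the model to the last N messages,
// so the token count stops compounding on every turn.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function capHistory(history: ChatMessage[], maxMessages = 10): ChatMessage[] {
  // Keep only the most recent turns; older ones are dropped entirely.
  // A smarter variant would summarize the dropped turns instead.
  return history.slice(-maxMessages);
}
```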

spammenotinoz commented 5 months ago

@tysonvolte @lgruen @rmkr @alexisdal @mckaywrigley

Try this fix https://github.com/mckaywrigley/chatbot-ui/pull/1571

BugFix:
- Bump Anthropic SDK to 0.18.0, resolve parsing issue.
- Improve error handling.
- Revise Anthropic to run on the edge, resolve timeout issue.

Please note that I cannot code, but this seemed to consistently fix the issue on my end.

lgruen commented 5 months ago

@spammenotinoz That worked perfectly! Thank you very much.

> > @tysonvolte Thanks for the suggestion! I tried that and have redeployed to Vercel, but it doesn't seem to help in my case.
>
> Can you check your Vercel logs and paste what error you're getting?

I had the same issue as @spammenotinoz described in https://github.com/mckaywrigley/chatbot-ui/issues/1543#issuecomment-2002431147. For reference, here are the Vercel logs (before @spammenotinoz's fix in https://github.com/mckaywrigley/chatbot-ui/pull/1571):

Mar 19 19:07:25.37
xxx
POST
/api/chat/anthropic
From chunk: [ 'event: content_block_delta' ]
Mar 19 19:07:25.37
xxx
POST
/api/chat/anthropic
Could not parse message into JSON:
Mar 19 19:07:25.37
xxx
POST
/api/chat/anthropic
SyntaxError: Unexpected end of JSON input at (node_modules/@anthropic-ai/sdk/streaming.mjs:58:39) at (node_modules/ai/dist/index.mjs:417:19) at (node_modules/ai/dist/index.mjs:298:30)
Mar 19 19:07:25.37
xxx
POST
/api/chat/anthropic
SyntaxError: Unexpected end of JSON input at (node_modules/@anthropic-ai/sdk/streaming.mjs:58:39) at (node_modules/ai/dist/index.mjs:417:19) at (node_modules/ai/dist/index.mjs:298:30)
Mar 19 19:07:25.37
xxx
POST
/api/chat/anthropic
SyntaxError: Unexpected end of JSON input at (node_modules/@anthropic-ai/sdk/streaming.mjs:58:39) at (node_modules/ai/dist/index.mjs:417:19) at (node_modules/ai/dist/index.mjs:298:30)
Mar 19 19:07:25.37
xxx
POST
/api/chat/anthropic
SyntaxError: Unexpected end of JSON input 
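The "Unexpected end of JSON input" in the logs above is what `JSON.parse` throws when a streamed event payload arrives truncated mid-object. A minimal illustration (the payload below is a made-up example, not an actual Anthropic wire message):

```typescript
// Why a truncated streaming chunk produces "Unexpected end of JSON input":
// JSON.parse throws a SyntaxError when the payload is cut mid-object.
const fullPayload = '{"type":"content_block_delta","delta":{"text":"hello"}}';

function tryParse(payload: string): unknown | null {
  try {
    return JSON.parse(payload);
  } catch (e) {
    if (e instanceof SyntaxError) return null; // same error class as in the logs
    throw e;
  }
}
```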
tysonvolte commented 5 months ago

Yeah, turns out it wasn't just that line; I'd already bumped Anthropic's SDK version for a separate issue, which is why it seemed like that alone fixed the problem. Great work, y'all!

mlovic commented 5 months ago

I had the same issue, also fixed by #1571. Thank you @spammenotinoz! Hope it gets merged soon.

mckaywrigley commented 5 months ago

Merged and fixed. @spammenotinoz you are a hero.