Closed: mikutsky closed this issue 5 months ago
Hi @mikutsky. Thanks for reporting this. Can you share any predictions for these? (Go to your replicate.com Dashboard, look under Predictions). Seeing that would help us tell if the problem is in the model or the client library.
Hi, I have the same issue.
@Gusakovskyi @mikutsky We've confirmed that there's an issue with stop sequences for meta/meta-llama-3-70b-instruct, and we're working on a fix.
It looks like a client library problem. I'm sharing the details of the second query only, because later queries accumulate the mistakes in the prompt.
Everything looks correct on the dashboard:
Here is the prompt for the second query, and the prompt is still correct:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are a helpful assistant. Answer briefly!<|eot_id|><|start_header_id|>user<|end_header_id|>
Hi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Hi! How can I help you today?<|eot_id|><|start_header_id|>user<|end_header_id|>
I read it, please recommend something else.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
However, the console output contains an extra 'Hi' token:
You: >? Hi!
Assistant: Hi! How can I help you today?
You: >? I read it, please recommend something else.
Assistant: Hi
I'd be happy to! However, I need a bit more information. What type of content are you in the mood for? A book, article, podcast, or something else?
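For reference, the prompt quoted above can be assembled with a small helper like the one below. This is only a sketch, not the code from this report: the function name `build_llama3_prompt` and the message dictionaries are hypothetical, and the two-newline separator after each header is an assumption taken from Meta's published Llama 3 template (the prompt as rendered in this thread shows a single line break).

```python
from typing import Dict, List

# Assumption: Meta's Llama 3 template puts a blank line after each header.
HEADER_SEP = "\n\n"


def build_llama3_prompt(system: str, messages: List[Dict[str, str]]) -> str:
    """Build a Llama 3 chat prompt ending with an open assistant turn."""
    parts = ["<|begin_of_text|>"]
    # The system turn comes first, closed with <|eot_id|> like every other turn.
    parts.append(
        f"<|start_header_id|>system<|end_header_id|>{HEADER_SEP}{system}<|eot_id|>"
    )
    for message in messages:
        role, content = message["role"], message["content"]
        parts.append(
            f"<|start_header_id|>{role}<|end_header_id|>{HEADER_SEP}{content}<|eot_id|>"
        )
    # Leave the assistant header open so the model writes the next reply.
    parts.append(f"<|start_header_id|>assistant<|end_header_id|>{HEADER_SEP}")
    return "".join(parts)


history = [
    {"role": "user", "content": "Hi!"},
    {"role": "assistant", "content": "Hi! How can I help you today?"},
    {"role": "user", "content": "I read it, please recommend something else."},
]
prompt = build_llama3_prompt("You are a helpful assistant. Answer briefly!", history)
```

If the client repeats the first token of an earlier reply, appending that reply to `history` carries the error into every later prompt, which is why only the second query is shown here.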
@mikutsky We just pushed a new build of the model, which should address the stop sequence problem. Please give your client code another try and let me know if that's working for you now.
If not, could you please try calling replicate.stream in isolation? I'd like to rule out the use of input and accessing mutable state in a loop, even though that should be running synchronously and not be a problem.
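A minimal sketch of what calling replicate.stream in isolation could look like: fixed prompts sent back to back, with no input() and no shared chat history. The helper names `stream_once` and `run_isolated` are hypothetical, the `stream_fn` parameter exists only so the helpers can be exercised with a stub, and passing only `prompt` as model input is an assumption (the reporter's actual inputs are not shown in this thread).

```python
from typing import Callable, Iterable, List, Optional

MODEL = "meta/meta-llama-3-70b-instruct"  # model named in this thread


def stream_once(
    prompt: str,
    stream_fn: Optional[Callable[..., Iterable[object]]] = None,
) -> List[str]:
    """Run one streamed prediction and return its chunks in arrival order."""
    if stream_fn is None:
        # Imported lazily so the helper can also be run against a stub.
        import replicate  # needs REPLICATE_API_TOKEN in the environment

        stream_fn = replicate.stream
    # Assumption: only "prompt" is passed; other inputs stay at model defaults.
    return [str(event) for event in stream_fn(MODEL, input={"prompt": prompt})]


def run_isolated(
    prompts: List[str],
    stream_fn: Optional[Callable[..., Iterable[object]]] = None,
) -> List[str]:
    """Send fixed prompts one after another and return each full reply."""
    outputs = []
    for prompt in prompts:
        chunks = stream_once(prompt, stream_fn)
        outputs.append("".join(chunks))
    return outputs
```

Calling `run_isolated(["Hi!", "Recommend a book."])` with a valid API token and printing `repr()` of each reply would show whether the second reply starts with a token left over from the first, independent of any interactive loop.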
Actually, I'm able to reproduce this in isolation, so it does appear to be an issue with the client. Working on a fix now.
Hi! I'm running into a problem where the first token is repeated in subsequent requests when using a stream. The prompt structure follows the Meta Llama 3 documentation. Could you explain why this is happening?
The output of a simple chat example looks like this:
Example code:
Thanks for your help!