ericmjl opened this issue 1 month ago
ishaan-jaff commented: On the latest version I get this error. @ericmjl, would you expect litellm to fake the streaming response?
GroqException - Error code: 400 - {'error': {'message': '`response_format` does not support streaming', 'type': 'invalid_request_error'}}
ericmjl replied: @ishaan-jaff, thinking about the problem from your perspective as a library maintainer: faking the streaming response might be good for the LiteLLM user experience, but it would also add a special case for you all to handle. I would love to see the streaming response faked. Groq is fast enough that, for all practical purposes, waiting for the full text to return is almost as good as watching it stream. That said, I am cognizant of the extra burden it might put on you all.
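As a caller-side stopgap while waiting on a maintainer decision, the suggested behavior can be approximated by making one non-streaming call and slicing the result into chunks. The wrapper below is a hypothetical sketch, not part of litellm's API; `fake_stream` and `chunk_size` are made-up names, and it yields plain strings rather than real streaming chunk objects.

```python
import litellm


def fake_stream(model: str, messages: list, chunk_size: int = 40, **kwargs):
    """Hypothetical workaround, not litellm API: run a single
    non-streaming completion, then yield the text in fixed-size
    slices so callers can keep consuming an iterator."""
    response = litellm.completion(
        model=model, messages=messages, stream=False, **kwargs
    )
    text = response.choices[0].message.content
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


# Usage: JSON mode works here because the underlying call never streams.
# for piece in fake_stream(
#     "groq/llama3-70b-8192",
#     [{"role": "user", "content": "Reply with a JSON object."}],
#     response_format={"type": "json_object"},
# ):
#     print(piece, end="", flush=True)
```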
What happened?
It appears that with LiteLLM version 1.35.38 (I have not upgraded to the latest because of other issues with Ollama JSON mode), I am unable to use Groq models with JSON mode and streaming enabled at the same time. I have a minimal notebook that reproduces this issue as a GitHub gist: https://gist.github.com/ericmjl/6f3e2cbbfcf26a8f3334a58af6a76f63
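For reference, a minimal repro in the spirit of that gist might look like the sketch below; the Groq model name and the prompt are illustrative, not taken from the gist.

```python
import litellm

# Fails on LiteLLM 1.35.38 with GroqException (Error code: 400):
# response_format does not support streaming on Groq.
response = litellm.completion(
    model="groq/llama3-70b-8192",  # illustrative model name
    messages=[
        {"role": "user", "content": "Return a JSON object with a 'greeting' key."}
    ],
    response_format={"type": "json_object"},  # JSON mode
    stream=True,  # combining this with JSON mode triggers the 400
)
for chunk in response:
    print(chunk)
```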
Relevant log output

(See the GroqException quoted above.)
Twitter / LinkedIn details
@ericmjl