Closed · cleesmith closed this issue 1 month ago
I tried the 5.9.2, 5.9.3, and 5.9.4 upgrades and the generator runtime error still occurs. This code seems to be getting close:
```python
async def CohereResponseStreamer(prompt):
    if MODEL is None:  # sometimes the model list is empty
        yield ""
        return
    try:
        co = cohere.Client(
            api_key=os.environ["CO_API_KEY"],
            timeout=30,
        )
        params = {
            "model": MODEL,
            "message": prompt,
        }
        if TEMP is not None:
            params["temperature"] = TEMP
        safety_unsupported_models = [
            "command", "command-r-03-2024", "command-r",
            "command-light", "command-light-nightly", "c4ai-aya-23-35b",
        ]
        if MODEL not in safety_unsupported_models:
            params["safety_mode"] = "NONE"
        # params["accepts"] = "text/event-stream"  # doesn't help the generator issue
        stream = co.chat_stream(**params)
        for chunk in stream:
            if ABORT:
                set_abort(False)
                yield "\n... response stopped by button click."
                stream.close()
                return  # exit the generator cleanly?
            if chunk.event_type == "text-generation":
                content = chunk.text
                if isinstance(content, str):
                    yield content.replace("**", "")
                else:
                    yield ""
    except GeneratorExit:
        return
    except RuntimeError:
        pass
```
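Until the SDK fix lands, one workaround is to guarantee the stream gets closed by wrapping it in `contextlib.closing`, so `close()` runs even when the consumer breaks out of the loop early. This is only a sketch of the pattern: `fake_stream`, `consume`, and `closed` below are stand-ins for illustration, not Cohere APIs, and it assumes `co.chat_stream(...)` returns a plain generator.

```python
import contextlib

closed = []

def fake_stream():
    # stand-in for co.chat_stream(**params): yields text chunks
    try:
        for text in ["Hello", " world", "!"]:
            yield text
    finally:
        closed.append(True)  # runs whether the stream finishes or is closed early

def consume(abort=False):
    chunks = []
    # closing() guarantees stream.close() even if we break out of the loop
    with contextlib.closing(fake_stream()) as stream:
        for text in stream:
            chunks.append(text)
            if abort:
                break
    return chunks

print(consume(abort=True))  # ['Hello']
print(closed[-1])           # True -- cleanup ran despite the early break
```

The same `with contextlib.closing(co.chat_stream(**params)) as stream:` shape would slot into the function above in place of the bare `stream = ...` assignment.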
I suppose the only option left is to use the Cohere models via an API key with OpenRouter ... any other ideas?
It might help to review the code here: https://github.com/openai/openai-python/blob/6172976b16821b24194a05e3e3fe5cb2342a2b4b/src/openai/lib/streaming/chat/_completions.py#L121 ... notice their comment:

> This context manager ensures the response cannot be leaked if you don't read
> the stream to completion.

... whereas the Cohere Python SDK currently leaks the stream.
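The context-manager pattern that openai-python comment describes can be sketched generically. Everything below is illustrative (the class and method names are not from any SDK): the point is that `__exit__` closes the underlying iterator even when the consumer stops reading early.

```python
class SafeStream:
    """Sketch of the context-manager pattern openai-python uses:
    __exit__ closes the underlying iterator even if the consumer
    stops reading early. Names here are illustrative only."""

    def __init__(self, iterator):
        self._iterator = iterator
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # never swallow the caller's exceptions

    def __iter__(self):
        return iter(self._iterator)

    def close(self):
        if not self.closed:
            self.closed = True
            if hasattr(self._iterator, "close"):
                self._iterator.close()  # close the generator, triggering its cleanup


def numbers():
    yield from range(5)


stream = SafeStream(numbers())
with stream:
    first = next(iter(stream))  # read only one item, then leave the block
print(first, stream.closed)  # 0 True
```

With this shape, abandoning the stream mid-read cannot leak the underlying generator, which is exactly the guarantee the Cohere SDK's `chat_stream` lacks here.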
Hey @cleesmith, many thanks for this detailed report. I have shared this with our SDK generation vendor so they can upstream a fix. Please track this issue! https://github.com/fern-api/fern/issues/4817
**SDK Version (required)**
**Describe the bug**
I see in the Terminal: `Exception ignored in: <generator object BaseCohere.chat_stream at 0x1541109e0>` ... every time I click a button to abort the stream before it's done. Aborting is required because models sometimes go a bit crazy and won't stop responding, or respond with obvious nonsense.
I think there's a bug in `chat_stream`'s handling of GeneratorExit, or something along those lines.
The following code works properly for Anthropic, Google/Gemini, Groq, LMStudio, Mistral, Ollama, OpenAI, OpenRouter, and Perplexity, but not for Cohere ...
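For reference, the `Exception ignored in: <generator object ...>` message is what CPython prints to stderr when a generator's cleanup raises during finalization. A minimal stand-alone repro, with no Cohere involved, looks like this (`leaky` is a hypothetical stand-in for whatever `BaseCohere.chat_stream`'s cleanup is doing wrong):

```python
def leaky():
    # stand-in for a stream whose cleanup misbehaves when closed early
    try:
        yield "chunk"
        yield "never reached"
    except GeneratorExit:
        # a bug in cleanup code turns a normal early close into an error
        raise RuntimeError("simulated cleanup failure")

g = leaky()
next(g)  # start the stream, then abandon it ...
del g    # ... CPython prints: Exception ignored in: <generator object leaky ...>
```

Calling `g.close()` explicitly instead of `del g` surfaces the same `RuntimeError` directly in the caller, which is why aborting the stream via the button trips over it.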
**Screenshots**
See the attached screenshot too.