Confirm this is an issue with the Python library and not an underlying OpenAI API
[X] This is an issue with the Python library
Describe the bug
base_url: links FastChat with the OpenAI API
LLM used for inference: "mistralai/Mixtral-8x7B-Instruct-v0.1"
When I try to stream chunks of generated text, I get this error:
Traceback (most recent call last):
File "/home/philip/miniconda3/lib/python3.12/site-packages/httpx/_transports/default.py", line 69, in map_httpcore_exceptions
yield
File "/home/philip/miniconda3/lib/python3.12/site-packages/httpx/_transports/default.py", line 113, in __iter__
for part in self._httpcore_stream:
File "/home/philip/miniconda3/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 367, in __iter__
raise exc from None
File "/home/philip/miniconda3/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 363, in __iter__
for part in self._stream:
File "/home/philip/miniconda3/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 349, in __iter__
raise exc
File "/home/philip/miniconda3/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 341, in __iter__
for chunk in self._connection._receive_response_body(**kwargs):
File "/home/philip/miniconda3/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 210, in _receive_response_body
event = self._receive_event(timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/philip/miniconda3/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 220, in _receive_event
with map_exceptions({h11.RemoteProtocolError: RemoteProtocolError}):
File "/home/philip/miniconda3/lib/python3.12/contextlib.py", line 158, in __exit__
self.gen.throw(value)
File "/home/philip/miniconda3/lib/python3.12/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
raise to_exc(exc) from exc
httpcore.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/philip/opena.py", line 19, in <module>
for chunk in stream:
File "/home/philip/miniconda3/lib/python3.12/site-packages/openai/_streaming.py", line 46, in __iter__
for item in self._iterator:
File "/home/philip/miniconda3/lib/python3.12/site-packages/openai/_streaming.py", line 58, in __stream__
for sse in iterator:
File "/home/philip/miniconda3/lib/python3.12/site-packages/openai/_streaming.py", line 50, in _iter_events
yield from self._decoder.iter_bytes(self.response.iter_bytes())
File "/home/philip/miniconda3/lib/python3.12/site-packages/openai/_streaming.py", line 280, in iter_bytes
for chunk in self._iter_chunks(iterator):
File "/home/philip/miniconda3/lib/python3.12/site-packages/openai/_streaming.py", line 291, in _iter_chunks
for chunk in iterator:
File "/home/philip/miniconda3/lib/python3.12/site-packages/httpx/_models.py", line 829, in iter_bytes
for raw_bytes in self.iter_raw():
File "/home/philip/miniconda3/lib/python3.12/site-packages/httpx/_models.py", line 883, in iter_raw
for raw_stream_bytes in self.stream:
File "/home/philip/miniconda3/lib/python3.12/site-packages/httpx/_client.py", line 126, in __iter__
for chunk in self._stream:
File "/home/philip/miniconda3/lib/python3.12/site-packages/httpx/_transports/default.py", line 112, in __iter__
with map_httpcore_exceptions():
File "/home/philip/miniconda3/lib/python3.12/contextlib.py", line 158, in __exit__
self.gen.throw(value)
File "/home/philip/miniconda3/lib/python3.12/site-packages/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)
Note: when I don't stream (stream=False), I don't get any errors and I get my inference (see the non-streaming sketch after the code snippet below).
To Reproduce
Simply run the code below with the same LLM, Python version, and openai library version listed here.
Code snippets
from openai import OpenAI

client = OpenAI(
    # This is the default and can be omitted
    api_key="myKey",
    base_url="https://base_url.io/v1"
)

stream = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Say this is a test",
        }
    ],
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
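For comparison, here is a minimal sketch of the non-streaming variant mentioned above. It reuses the same placeholder api_key and base_url from the snippet, and on my setup it returns the inference without raising RemoteProtocolError:

from openai import OpenAI

# Same placeholder credentials and endpoint as in the streaming snippet above.
client = OpenAI(
    api_key="myKey",
    base_url="https://base_url.io/v1"
)

# Non-streaming request: the full response body is returned in one piece,
# and no httpx.RemoteProtocolError is raised.
completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Say this is a test",
        }
    ],
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    stream=False
)

print(completion.choices[0].message.content)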
OS
linux
Python version
Python 3.11.8
Library version
openai 1.24.0