Closed: mohittalele closed this issue 3 months ago
Able to repro.
This is caused by the finish_reason being set while additional chunks are still being yielded, which triggers multiple calls to the success callback:
if ( "async_complete_streaming_response" in self.model_call_details ): await callback.async_log_success_event( kwargs=self.model_call_details, response_obj=self.model_call_details[ "async_complete_streaming_response" ], start_time=start_time, end_time=end_time, )
```python
complete_streaming_response = None
if self.stream:
    if result.choices[0].finish_reason is not None:  # if it's the last chunk
        self.streaming_chunks.append(result)
        verbose_logger.debug(
            f"final set of received chunks: {self.streaming_chunks}"
        )
        try:
            complete_streaming_response = litellm.stream_chunk_builder(
```
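For context, here is a minimal standalone script (not from this thread; the model and prompt are assumptions) that makes the extra chunk visible. With stream_options={"include_usage": True}, an OpenAI-style stream yields one more chunk after the one carrying finish_reason: an empty-choices chunk with usage attached. Any logic that treats a non-null finish_reason as "this is the last chunk" can therefore run its completion path more than once.

```python
import asyncio

import litellm


async def main():
    response = await litellm.acompletion(
        model="gpt-3.5-turbo",  # assumed model
        messages=[{"role": "user", "content": "say hi"}],
        stream=True,
        stream_options={"include_usage": True},
    )
    async for chunk in response:
        # The usage chunk arrives after the finish_reason chunk
        # and has an empty choices list.
        finish = chunk.choices[0].finish_reason if chunk.choices else None
        print(f"finish_reason={finish} usage={getattr(chunk, 'usage', None)}")


asyncio.run(main())
```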
What happened?
I am trying to test the functionality of callbacks. Here is my simple FastAPI server setup:
custom_callback.py
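(The original contents of this file did not survive extraction; below is a minimal sketch of what it likely looked like, assuming litellm's documented CustomLogger interface. The class name is a placeholder.)

```python
# custom_callback.py -- minimal sketch, not the reporter's original file.
from litellm.integrations.custom_logger import CustomLogger


class MyCustomHandler(CustomLogger):
    async def async_log_success_event(
        self, kwargs, response_obj, start_time, end_time
    ):
        # Expected once per completed request; per this issue it fires
        # twice for streamed requests when stream_options is set.
        print("async_log_success_event called")
```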
FastAPI server:
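(This snippet was also lost; a minimal sketch follows, assuming the handler above, an arbitrary endpoint path, and stream_options={"include_usage": True}, the setting that reproduces the double call.)

```python
# main.py -- hedged reconstruction, not the reporter's original server.
import litellm
from fastapi import FastAPI

from custom_callback import MyCustomHandler

litellm.callbacks = [MyCustomHandler()]

app = FastAPI()


@app.post("/chat/completions")
async def chat_completions(payload: dict):
    response = await litellm.acompletion(
        model="gpt-3.5-turbo",  # assumed model
        messages=payload["messages"],
        stream=True,
        # Setting this to None makes the callback fire only once.
        stream_options={"include_usage": True},
    )
    # Drain the stream so litellm builds the complete response and
    # invokes the success callback.
    text = ""
    async for chunk in response:
        if chunk.choices and chunk.choices[0].delta.content:
            text += chunk.choices[0].delta.content
    return {"response": text}
```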
Here is a sample curl request:
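(The original command was not preserved; the one below targets the endpoint sketched above.)

```bash
curl -X POST http://localhost:8000/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "hello"}]}'
```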
The async_log_success_event function is called twice, when I would expect it to be called once. If I set stream_options = None, it is called once. Am I missing something?