run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai

[Bug]: End of response stream is of type 'NoneType' using OpenAILike model, resulting in a TypeError #11289

Closed tammenha closed 5 months ago

tammenha commented 9 months ago

Bug Description

When generating a response stream (e.g. with print_response_stream()), I get a TypeError as the generation of the response concludes. The response is fully generated at that point, but I still get

TypeError: can only concatenate str (not "NoneType") to str

For details see the logs below.

I'm using LlamaIndex with LocalAI, but I'm not sure if that's related.

These are the models I used:
https://huggingface.co/TheBloke/em_german_leo_mistral-GGUF
https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF

Version

0.10.10

Steps to Reproduce

from llama_index.llms.openai_like import OpenAILike
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings

# LocalAI exposes an OpenAI-compatible endpoint; the API key is a dummy value.
LOCALAI_DEFAULTS = {
    "api_key": "localai_fake",
    "api_type": "localai_fake",
    "api_base": "http://localhost:8080/v1",
}

documents = SimpleDirectoryReader("data").load_data()
Settings.llm = OpenAILike(model="<model>", timeout=60.0, **LOCALAI_DEFAULTS)
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(streaming=True)
response = query_engine.query("<Insert a question>")
response.print_response_stream()

Relevant Logs/Tracebacks

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[20], line 3
      1 query_engine = index.as_query_engine(streaming=True)
      2 response = query_engine.query("<Here stands a question>")
----> 3 response.print_response_stream()

File ~/chatbot/chatbot-env/lib/python3.11/site-packages/llama_index/core/base/response/schema.py:122, in StreamingResponse.print_response_stream(self)
    120 if self.response_txt is None and self.response_gen is not None:
    121     response_txt = ""
--> 122     for text in self.response_gen:
    123         print(text, end="", flush=True)
    124         response_txt += text

File ~/chatbot/chatbot-env/lib/python3.11/site-packages/llama_index/core/llms/llm.py:71, in stream_completion_response_to_tokens.<locals>.gen()
     70 def gen() -> TokenGen:
---> 71     for response in completion_response_gen:
     72         yield response.delta or ""

File ~/chatbot/chatbot-env/lib/python3.11/site-packages/llama_index/core/llms/callbacks.py:224, in llm_completion_callback.<locals>.wrap.<locals>.wrapped_llm_predict.<locals>.wrapped_gen()
    222 def wrapped_gen() -> CompletionResponseGen:
    223     last_response = None
--> 224     for x in f_return_val:
    225         yield cast(CompletionResponse, x)
    226         last_response = x

File ~/chatbot/chatbot-env/lib/python3.11/site-packages/llama_index/llms/openai/base.py:441, in OpenAI._stream_complete.<locals>.gen()
    439 else:
    440     delta = ""
--> 441 text += delta
    442 yield CompletionResponse(
    443     delta=delta,
    444     text=text,
    445     raw=response,
    446     additional_kwargs=self._get_response_token_counts(response),
    447 )

TypeError: can only concatenate str (not "NoneType") to str
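
The traceback pinpoints the failure: the final streaming chunk from LocalAI apparently carries a None delta, which the accumulator in OpenAI._stream_complete concatenates unguarded at text += delta. A minimal standalone illustration of the failure mode and the obvious guard (hypothetical values, not LlamaIndex code):

text = "partial response"  # text accumulated from earlier chunks
delta = None               # what the final chunk apparently carries
# text += delta            # raises: can only concatenate str (not "NoneType") to str
text += delta or ""        # coercing None to "" avoids the error

Since the response is reportedly complete by the time the error fires, a blunt consumer-side stopgap (an assumption based on that observation, not a vetted fix) is to consume the token stream manually and swallow the trailing error:

try:
    # response is the StreamingResponse from the repro above
    for token in response.response_gen:
        print(token, end="", flush=True)
except TypeError:
    pass  # raised by the trailing None delta after the full response has streamed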
dosubot[bot] commented 9 months ago

🤖

Hey there @tammenha! Great to see you back. Hope you're doing well.

Based on the context provided, it seems like you're encountering a similar issue to what was reported in these two issues: Issue #11029 and Issue #9549.

The TypeError is likely caused by concatenating a string with a None value; in Python, you cannot concatenate None to a string directly. This can occur with ChatMessage objects where content=None: if you concatenate the content of such a message with a string elsewhere in your code, you'll hit this error.

To fix this, you need to ensure that you're not trying to concatenate None with a string. You could add a check before concatenating to see if the value is None and handle it appropriately. For example:

# Only concatenate when content is actually set
if content is not None:
    result = some_string + content
else:
    result = some_string

Issue #11029 was fixed in llama-index-llms-vllm 0.1.4. However, you mentioned that you're using LlamaIndex version 0.10.10, while the latest version is v0.10.11. It's possible that the issue you're encountering has been fixed in the latest version, so I would recommend updating and seeing whether it persists.

If the issue still persists after updating, could you please provide more information about the context in which you're trying to concatenate a string with the content of a ChatMessage? This would help in providing a more precise solution.
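
To confirm which version is installed before and after upgrading, a quick check (assuming the 0.10.x package layout, where the core package exposes a version string):

import llama_index.core
print(llama_index.core.__version__)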

logan-markewich commented 9 months ago

lame, why is localai sending None in deltas during streaming lol

Will need to update the OpenAI LLM class to fix this.
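
Presumably the guard belongs in the streaming generator shown in the traceback. A minimal sketch of that shape, assuming the completions-style chunk layout and that a missing or None delta should be treated as an empty string (raw_stream is a hypothetical name for the OpenAI-compatible chunk stream; this is a sketch, not the actual patch):

from llama_index.core.llms import CompletionResponse, CompletionResponseGen

def gen() -> CompletionResponseGen:
    text = ""
    for response in raw_stream:
        # Some servers (e.g. LocalAI) send a final chunk whose text field is None.
        delta = response.choices[0].text if response.choices else None
        delta = delta or ""  # coerce None to "" before accumulating
        text += delta
        yield CompletionResponse(delta=delta, text=text, raw=response)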

wey-gu commented 5 months ago

@logan-markewich I am still hitting this today with OpenAILike via LiteLLM and Ollama serving Mixtral 8x7B, hence this change: https://github.com/run-llama/llama_index/pull/14365.