langchain-ai / langchain

šŸ¦œšŸ”— Build context-aware reasoning applications
https://python.langchain.com

langchain_openai reports error `role Input should be a valid string` when connecting to specific LLM API #27325

Open Moskize91 opened 1 week ago

Moskize91 commented 1 week ago

Checked other resources

Example Code

from langchain_openai import ChatOpenAI

api_key = "XXXXXXXX"
model_name="gpt-3.5-turbo"
base_url = "https://XXXXXX" # specific LLM vendor
llm = ChatOpenAI(
    api_key=api_key,
    base_url=base_url
    model=model_name,
)
llm.invoke("hello world")

Error Message and Stack Trace (if applicable)

Traceback (most recent call last):
  File "/Users/taozeyu/codes/github.com/moksize91/llm-inception/server/llm/node.py", line 320, in _invoke
    resp_content = chain.invoke(
                   ^^^^^^^^^^^^^
  File "/Users/taozeyu/codes/github.com/moksize91/llm-inception/.venv/lib/python3.12/site-packages/langchain_core/runnables/base.py", line 3022, in invoke
    input = context.run(step.invoke, input, config, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/taozeyu/codes/github.com/moksize91/llm-inception/.venv/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 286, in invoke
    self.generate_prompt(
  File "/Users/taozeyu/codes/github.com/moksize91/llm-inception/.venv/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 786, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/taozeyu/codes/github.com/moksize91/llm-inception/.venv/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 643, in generate
    raise e
  File "/Users/taozeyu/codes/github.com/moksize91/llm-inception/.venv/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 633, in generate
    self._generate_with_cache(
  File "/Users/taozeyu/codes/github.com/moksize91/llm-inception/.venv/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 851, in _generate_with_cache
    result = self._generate(
             ^^^^^^^^^^^^^^^
  File "/Users/taozeyu/codes/github.com/moksize91/llm-inception/.venv/lib/python3.12/site-packages/langchain_openai/chat_models/base.py", line 685, in _generate
    return self._create_chat_result(response, generation_info)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/taozeyu/codes/github.com/moksize91/llm-inception/.venv/lib/python3.12/site-packages/langchain_openai/chat_models/base.py", line 722, in _create_chat_result
    message = _convert_dict_to_message(res["message"])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/taozeyu/codes/github.com/moksize91/llm-inception/.venv/lib/python3.12/site-packages/langchain_openai/chat_models/base.py", line 159, in _convert_dict_to_message
    return ChatMessage(content=_dict.get("content", ""), role=role, id=id_)  # type: ignore[arg-type]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/taozeyu/codes/github.com/moksize91/llm-inception/.venv/lib/python3.12/site-packages/langchain_core/messages/base.py", line 76, in __init__
    super().__init__(content=content, **kwargs)
  File "/Users/taozeyu/codes/github.com/moksize91/llm-inception/.venv/lib/python3.12/site-packages/langchain_core/load/serializable.py", line 111, in __init__
    super().__init__(*args, **kwargs)
  File "/Users/taozeyu/codes/github.com/moksize91/llm-inception/.venv/lib/python3.12/site-packages/pydantic/main.py", line 213, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for ChatMessage
role
  Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.9/v/string_type

Description

This error only occurs with one specific vendor. The vendor serves an OpenAI-compatible API, but some details are not fully consistent with OpenAI's.

Here is the code:

https://github.com/langchain-ai/langchain/blob/2197958366b527df23adbf12bf6578e4cd5e002c/libs/partners/openai/langchain_openai/chat_models/base.py#L109-L158

As you can see, LangChain reads the role field from the _dict returned by the vendor's server and dispatches on it in the if-else block. In my tests with the reproduction code above, when the request goes to the real OpenAI API, role in _dict is "assistant"; when it goes to certain vendors, role may be None.
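
To make the difference concrete, the per-choice message dict looks roughly like this (the exact vendor payload is my assumption, inferred from the traceback):

res_from_openai = {"message": {"role": "assistant", "content": "Hello!"}}  # real OpenAI: role is present
res_from_vendor = {"message": {"role": None, "content": "Hello!"}}         # problematic vendor: role is null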

Therefore, execution falls into the final else branch, namely ChatMessage(content=_dict.get("content", ""), role=role, id=id_).

Unfortunately, ChatMessage validates role in __init__() and requires it to be a str. Since role is None here, the request fails every time.
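
The failure is easy to reproduce in isolation:

from langchain_core.messages import ChatMessage

# Raises pydantic's ValidationError: role "Input should be a valid string",
# exactly as in the traceback above.
ChatMessage(content="Hello!", role=None)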

I noticed that some LLM providers return no value for role at all. Admittedly, those providers' APIs are not well designed, but it would be ideal if LangChain could tolerate them. A single check in this code would suffice: if role has no value, default it to "assistant", and these providers' APIs are handled smoothly.
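
Here is a minimal sketch of the kind of guard I have in mind (convert_dict_to_message_lenient is a hypothetical helper for illustration, not LangChain's actual code):

from langchain_core.messages import AIMessage, BaseMessage, ChatMessage

def convert_dict_to_message_lenient(_dict: dict) -> BaseMessage:
    # Default a missing or null role to "assistant": a chat completion's
    # reply message is an assistant turn in practice.
    role = _dict.get("role") or "assistant"
    content = _dict.get("content") or ""
    if role == "assistant":
        return AIMessage(content=content)
    # Any other explicit role still goes through the generic path.
    return ChatMessage(content=content, role=role)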

I also noticed that the project https://github.com/SillyTavern/SillyTavern handles this vendor's API correctly, while LangChain so far cannot. SillyTavern effectively takes the approach I described, so its requests do not crash.

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 24.0.0: Tue Sep 24 23:39:07 PDT 2024; root:xnu-11215.1.12~1/RELEASE_ARM64_T6000
Python Version: 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:51:49) [Clang 16.0.6 ]

Package Information

langchain_core: 0.3.9
langchain: 0.3.2
langchain_community: 0.3.1
langsmith: 0.1.132
langchain_anthropic: 0.2.3
langchain_openai: 0.2.2
langchain_text_splitters: 0.3.0

Optional packages not installed

langgraph
langserve

Other Dependencies

aiohttp: 3.10.9
anthropic: 0.35.0
async-timeout: 4.0.3
dataclasses-json: 0.6.7
defusedxml: 0.7.1
httpx: 0.27.2
jsonpatch: 1.33
numpy: 1.26.4
openai: 1.51.1
orjson: 3.10.7
packaging: 24.1
pydantic: 2.9.2
pydantic-settings: 2.5.2
PyYAML: 6.0.2
requests: 2.32.3
requests-toolbelt: 1.0.0
SQLAlchemy: 2.0.35
tenacity: 8.5.0
tiktoken: 0.8.0
typing-extensions: 4.12.2

kodychik commented 1 week ago

Can I work on this issue? Thank you

keenborder786 commented 5 days ago

@Moskize91 I have patched a fix: default the role to assistant if it is not provided in the vendor message.

eyurtsev commented 5 days ago

I'm not convinced we want to do this unless there's more evidence that this is an issue with many LLM providers. It could lead to other subtle bugs down the road.

Would consider patching if there's enough support from the community + some examples of LLM providers that fail.

cc @efriis

Moskize91 commented 5 days ago

@keenborder786 https://github.com/langchain-ai/langchain/pull/27398 Great! If this PR can be merged, it will solve my problem.

Moskize91 commented 5 days ago

@eyurtsev Actually, after I spent half an hour tracing LangChain's crash to these lines of code and determining that the vendor had returned a malformed response, my first reaction was that it was the vendor's fault. So I fully understand what you said: "I don't love changing correct code into incorrect code just b/c some provider got the wrong format specification."

But I later found that SillyTavern is in fact compatible with this faulty vendor, so I went through SillyTavern's code to see how its implementation differs from LangChain's.

https://github.com/SillyTavern/SillyTavern/blob/ba6f7b7a98cf7a5eaf4f0e81da9779a9a668ced4/public/script.js#L5307-L5326

In fact, SillyTavern does nothing special: it simply never reads the role field, and a field that is never read cannot crash anything. I agree that LangChain should not deliberately accommodate vendor errors, but that does not mean it has to crash the moment it detects one. LangChain could simply ignore the error and let the code keep running.
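
In Python terms, what SillyTavern does amounts to the following (a sketch assuming the standard OpenAI response shape, with extract_text as a hypothetical name):

def extract_text(response: dict) -> str:
    # Read only the content; never touch the role field, so a missing
    # or null role cannot cause a crash.
    return response["choices"][0]["message"]["content"]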

As a user, when I hit this problem there is no way to work around it. And I do not care about the role field at all: what else could the role returned by an LLM be, except "assistant"? Yet LangChain crashes on a field I never use.

Other LLM-related projects generally do not care about the role field either. Since they never read it, they cannot crash the way LangChain does. They add no special compatibility for vendor errors; they simply do nothing, and everything works.