merge_dicts in _merge can't merge different values of instance float and raises a type error

rowanamgad commented 8 months ago

Checked other resources

[X] I added a very descriptive title to this issue.
[X] I searched the LangChain documentation with the integrated search.
[X] I used the GitHub search to find a similar question and didn't find it.
[X] I am sure that this is a bug in LangChain rather than my code.

Example Code

from langchain_google_vertexai import ChatVertexAI, VertexAI, VertexAIEmbeddings

llm_chain = LLMChain(llm=llm, prompt=prompt_template)
res = llm_chain.predict(user_prompt=user_prompt)

Error Message and Stack Trace (if applicable)

Prompted _[llm/error] [1:chain:LLMChain > 2:llm:VertexAI] [4.64s] LLM run errored with error: "TypeError(\"Additional kwargs key Finance already exists in left dict and value has unsupported type <class 'float'>.\")Traceback (most recent call last):

Description

I'm trying to use the text-unicorn model through Vertex AI with the stream parameter set to true. With every chunk generated by the LLM, the generation_info dict contains the same keys, but the values differ from one returned generation to the next. Accordingly, a runtime error is raised and no proper answer is returned from the LLM.

System Info

System Information

OS: Windows
OS Version: 10.0.19045
Python Version: 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)]

Package Information

langchain_core: 0.1.15
langchain: 0.0.354
langchain_community: 0.0.15
langchain_benchmarks: 0.0.10
langchain_experimental: 0.0.47
langchain_google_genai: 0.0.2
langchain_google_vertexai: 0.0.2
langchainhub: 0.1.14

Packages not installed (Not Necessarily a Problem)

The following packages were not found:

langgraph
langserve

dosubot[bot] commented 8 months ago

🤖

Hello @rowanamgad, great to see you reaching out again! Hope you're doing well.

Thank you for providing such a detailed report. It seems like the issue you're encountering is due to the merge_dicts function in _merge.py not being able to handle merging of float values. This function is designed to merge two dictionaries, but it currently only supports merging of str, dict, and list types. If the types of the values from the 'left' and 'right' dictionaries are not one of these, and they are not equal, the function raises a TypeError.

Here is the relevant part of the code:

elif type(merged[k]) != type(v):
    raise TypeError(
        f'additional_kwargs["{k}"] already exists in this message,'
        " but with a different type."
    )
elif isinstance(merged[k], str):
    merged[k] += v
elif isinstance(merged[k], dict):
    merged[k] = merge_dicts(merged[k], v)
elif isinstance(merged[k], list):
    merged[k] = merged[k] + v
else:
    raise TypeError(
        f"Additional kwargs key {k} already exists in left dict and value has "
        f"unsupported type {type(merged[k])}."
    )

As you can see, there is no condition to handle merging of float values. This is likely the cause of the TypeError you're seeing.
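
To make the fall-through concrete, here is a minimal, self-contained sketch modeled on the excerpt above. It is not the library's code: `merge_dicts_sketch` is a hypothetical name, and the initial "key not yet present" branch is an assumption added so the sketch can run on its own.

```python
from typing import Any, Dict


def merge_dicts_sketch(left: Dict[str, Any], right: Dict[str, Any]) -> Dict[str, Any]:
    """Simplified stand-in for the excerpt above (hypothetical, not the library code)."""
    merged = left.copy()
    for k, v in right.items():
        if k not in merged:
            # Assumed branch: a key seen for the first time is copied over.
            merged[k] = v
        elif type(merged[k]) != type(v):
            raise TypeError(
                f'additional_kwargs["{k}"] already exists in this message,'
                " but with a different type."
            )
        elif isinstance(merged[k], str):
            merged[k] += v
        elif isinstance(merged[k], dict):
            merged[k] = merge_dicts_sketch(merged[k], v)
        elif isinstance(merged[k], list):
            merged[k] = merged[k] + v
        else:
            # float, int, and bool values all end up here.
            raise TypeError(
                f"Additional kwargs key {k} already exists in left dict and value has "
                f"unsupported type {type(merged[k])}."
            )
    return merged


try:
    merge_dicts_sketch({"Finance": 0.1}, {"Finance": 0.2})
except TypeError as exc:
    print(exc)
```

In this sketch, two float values under the same key match none of the str, dict, or list branches, so the final `else` raises the error reported in the issue.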

Unfortunately, as a technical support rep, I don't have the ability to create pull requests or issues. However, I would recommend reporting this issue to the LangChain maintainers so they can consider adding support for merging float values in a future version of the framework. You can do this by creating a new issue in the LangChain repository and providing all the details you've shared here.

In the meantime, you might want to consider a workaround. If it's possible in your use case, you could convert the float values to strings before they are passed to the merge_dicts function, and then convert them back to floats after the merge. This would avoid the TypeError, but it might not be suitable for all use cases.
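
One caveat with this workaround: the str branch in the excerpt above concatenates values, so two stringified floats would be glued together ("0.1" + "0.2" gives "0.10.2"). The sketch below shows one way the round trip could be made to work, assuming each value is encoded with a trailing separator. The helper names `encode_float` and `decode_floats` are hypothetical, not part of LangChain.

```python
from typing import List

SEPARATOR = ";"  # assumed delimiter; any character that cannot appear in a float repr works


def encode_float(value: float) -> str:
    """Stringify a float with a trailing separator so concatenation stays parseable."""
    return f"{value!r}{SEPARATOR}"


def decode_floats(merged: str) -> List[float]:
    """Split a concatenated string back into the individual float values."""
    return [float(part) for part in merged.split(SEPARATOR) if part]


# Plain string concatenation is what the str branch of the merge performs.
merged_value = encode_float(0.1) + encode_float(0.2)
print(decode_floats(merged_value))  # [0.1, 0.2]
```

Note that this recovers one value per chunk rather than a single float, so the calling code still has to decide which value to keep.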

I hope this helps! If you have any further questions, please don't hesitate to ask.

Sources

libs/core/langchain_core/utils/_merge.py


leomensah commented 8 months ago

I am getting a similar TypeError message when I enable the streaming function in ChatVertexAI.

TypeError: Additional kwargs key Finance already exists in left dict and value has unsupported type <class 'float'>.

skozlovf commented 8 months ago

The same issue occurs with ChatVertexAI. It fails when merging ChatGenerationChunk objects where generation_info["safety_attributes"] has the same keys but different values: VertexAI sometimes returns different values for the same safety attributes from one chunk to the next.

How to reproduce:

from langchain_core.utils._merge import merge_dicts

chunk1_gen_info = {"is_blocked": False, "safety_attributes": {"Health": 0.5, "Illicit Drugs": 0.1, "Legal": 0.1}}
chunk2_gen_info = {"is_blocked": False, "safety_attributes": {"Health": 0.5, "Illicit Drugs": 0.2, "Legal": 0.1}}
merge_dicts(chunk1_gen_info, chunk2_gen_info)

Note that Illicit Drugs has different values.

Simplified version:

from langchain_core.utils._merge import merge_dicts
merge_dicts({"a": 1}, {"a": 2})

merge_dicts works incorrectly in other cases too:

assert merge_dicts({"a": None}, {"a": 0}) == {"a": 0}  # TypeError

skozlovf commented 8 months ago

It looks like VertexAI generates safety attributes for each chunk, and ChatGenerationChunk/merge_dicts doesn't support that. Also, @baskaryan wrote in https://github.com/langchain-ai/langchain/pull/16605#pullrequestreview-1876893433:

we intentionally don't support ints as it's not clear how to merge them. in this case you would just be overwriting past ints. a better solution would be to update the dicts that are streaming to return something mergeable, like a list of ints

Related issues: #16605, #17105.
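
For illustration, the "return something mergeable" suggestion quoted above could look roughly like the sketch below. It is self-contained and does not call LangChain: `wrap_scalars` and `merge_wrapped` are hypothetical helpers, with `merge_wrapped` standing in for a merge that concatenates lists and recurses into dicts.

```python
from typing import Any, Dict


def wrap_scalars(info: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: wrap bool/int/float leaves in one-element lists."""
    wrapped: Dict[str, Any] = {}
    for key, value in info.items():
        if isinstance(value, dict):
            wrapped[key] = wrap_scalars(value)
        elif isinstance(value, (bool, int, float)):
            wrapped[key] = [value]
        else:
            wrapped[key] = value
    return wrapped


def merge_wrapped(left: Dict[str, Any], right: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in merge: concatenates lists and recurses into dicts."""
    merged = dict(left)
    for key, value in right.items():
        if key not in merged:
            merged[key] = value
        elif isinstance(merged[key], dict) and isinstance(value, dict):
            merged[key] = merge_wrapped(merged[key], value)
        elif isinstance(merged[key], list) and isinstance(value, list):
            merged[key] = merged[key] + value
        else:
            raise TypeError(f"cannot merge key {key!r}")
    return merged


chunk1_gen_info = {"is_blocked": False, "safety_attributes": {"Health": 0.5, "Illicit Drugs": 0.1, "Legal": 0.1}}
chunk2_gen_info = {"is_blocked": False, "safety_attributes": {"Health": 0.5, "Illicit Drugs": 0.2, "Legal": 0.1}}

merged = merge_wrapped(wrap_scalars(chunk1_gen_info), wrap_scalars(chunk2_gen_info))
print(merged["safety_attributes"]["Illicit Drugs"])  # [0.1, 0.2]
```

With this shape every per-chunk score is preserved in order, and the consumer can pick the last value, the maximum, or whatever aggregation fits.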
