langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

Encounter KeyError: 'user_agent' while using CohereRerank() #12899

Closed plaban1981 closed 3 months ago

plaban1981 commented 8 months ago

System Info

langchain==0.0.330
Python 3.10.12

Who can help?

@hwchase17

Reproduction

from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import CohereRerank

#
compressor = CohereRerank()
#

Error Encountered

KeyError                                  Traceback (most recent call last)
in <cell line: 5>()
      3
      4 #
----> 5 compressor = CohereRerank()
      6 #
      7 compression_retriever = ContextualCompressionRetriever(

2 frames
/usr/local/lib/python3.10/dist-packages/pydantic/main.cpython-310-x86_64-linux-gnu.so in pydantic.main.BaseModel.__init__()

/usr/local/lib/python3.10/dist-packages/pydantic/main.cpython-310-x86_64-linux-gnu.so in pydantic.main.validate_model()

/usr/local/lib/python3.10/dist-packages/langchain/retrievers/document_compressors/cohere_rerank.py in validate_environment(cls, values)
     53             values, "cohere_api_key", "COHERE_API_KEY"
     54         )
---> 55         client_name = values["user_agent"]
     56         values["client"] = cohere.Client(cohere_api_key, client_name=client_name)
     57         return values

KeyError: 'user_agent'

Expected behavior

The error shown above should not occur: the Cohere API key has already been set, and cohere.Client() works with that key.

dosubot[bot] commented 8 months ago

🤖

Hello @plaban1981,

I'm Dosu, a friendly bot here to assist you with LangChain. I'm equipped to help debug issues, answer your questions, and guide you on how to contribute to the repository. While we wait for a human maintainer, I'm here to help.

I'm currently reviewing the details of your issue and will respond with a full answer shortly.

dosubot[bot] commented 8 months ago

🤖

It seems like you're encountering a KeyError for 'user_agent' when instantiating the CohereRerank class in LangChain. The validate_environment method of CohereRerank reads values["user_agent"] directly, so a KeyError is raised whenever the 'user_agent' key is absent from the values dictionary, which is the case when no user_agent is passed to the constructor.

To fix this issue, ensure that you are providing the 'user_agent' key when creating an instance of the CohereRerank class. Here is an example:

from langchain.retrievers.document_compressors import CohereRerank

compressor = CohereRerank(user_agent="my-app")

In this example, "my-app" is the identifier for the application making the request. You can replace it with your own identifier.
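The lookup behind this error can be reproduced without LangChain or Cohere installed. The sketch below is only an illustration: validate_environment_sketch is a hypothetical stand-in for the validator shown in the traceback, and a plain dict takes the place of cohere.Client so that nothing is sent over the network.

```python
def validate_environment_sketch(values: dict) -> dict:
    """Hypothetical stand-in for the validator in the traceback (no Cohere call)."""
    # Direct indexing, as on line 55 of the traceback: raises KeyError if absent.
    client_name = values["user_agent"]
    # A plain dict stands in for cohere.Client so the sketch runs offline.
    values["client"] = {"client_name": client_name}
    return values


# CohereRerank() with no arguments corresponds to an empty input dict.
try:
    validate_environment_sketch({})
    outcome = "no error"
except KeyError as exc:
    outcome = f"KeyError: {exc}"

# CohereRerank(user_agent="my-app") corresponds to a dict that contains the key.
validated = validate_environment_sketch({"user_agent": "my-app"})

print(outcome)
print(validated["client"])
```

With an empty input the direct lookup fails, while supplying the key explicitly lets the same code run to completion, which is why passing user_agent works around the error.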

Additionally, I found a similar solved issue in the LangChain repository. The issue with the CohereRerank client field being required but missing can be resolved by initializing and passing the client object in cohere_rerank.py. A workaround was also provided by using custom code:

from typing import ForwardRef
from pydantic import BaseModel
from langchain.retrievers.document_compressors import CohereRerank

class CustomCohereRerank(CohereRerank):
    class Config(BaseModel.Config):
        arbitrary_types_allowed = True

CustomCohereRerank.update_forward_refs()

compressor = CustomCohereRerank(client=co)

Here, co is a Cohere client created beforehand with co = cohere.Client(key).

You can find more details about this issue here.

I hope this helps! If you have any other questions, feel free to ask.

gardner commented 7 months ago

A workaround:

from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import CohereRerank
from os import getenv

class CustomCohereRerank(CohereRerank):
    user_agent = 'gardner'

def get_compressor(retriever):
    # Only wrap the retriever when a Cohere API key is available.
    if getenv("COHERE_API_KEY"):
        return ContextualCompressionRetriever(
            base_compressor=CustomCohereRerank(), base_retriever=retriever
        )
    else:
        return retriever

gardner commented 7 months ago

@billytrend-cohere do you recognize this part of the code?

yoch commented 7 months ago

Hi,

Same issue here with an up-to-date LangChain version.

The workaround is simply to pass user_agent explicitly, but according to the documentation I shouldn't need to do so.

compressor = CohereRerank(
    model='rerank-multilingual-v2.0',
    user_agent="langchain", # fixes the KeyError: 'user_agent'
)

EDIT: I checked the source code, and the problem clearly comes from this line: https://github.com/langchain-ai/langchain/blob/1c4bfb8c5f8e429b2409762b2c7049cd0004b3b0/libs/langchain/langchain/retrievers/document_compressors/cohere_rerank.py#L43 Field defaults are not passed to a validator declared with root_validator(pre=True); it must be pre=False for the user_agent default value to be available.
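The ordering described here can be modelled in plain Python. This is a simplified sketch, not pydantic itself and not the code from the repository: DEFAULTS, run_validation, strict_validator and tolerant_validator are hypothetical names, and the model only assumes that a pre=True validator receives the raw constructor input while a pre=False validator receives the input merged with field defaults.

```python
from typing import Callable, Dict

# Hypothetical field defaults, mirroring the user_agent default mentioned above.
DEFAULTS: Dict[str, object] = {"user_agent": "langchain", "top_n": 3}


def strict_validator(values: dict) -> dict:
    """Reads user_agent by direct indexing, like the line in the traceback."""
    values["client_name"] = values["user_agent"]
    return values


def tolerant_validator(values: dict) -> dict:
    """Assumed alternative: fall back to a default when the key is missing."""
    values["client_name"] = values.get("user_agent", "langchain")
    return values


def run_validation(raw: dict, validator: Callable[[dict], dict], pre: bool) -> dict:
    """Simplified model of the two validator modes."""
    if pre:
        # pre=True: the validator sees only what the caller passed in.
        validated = validator(dict(raw))
        return {**DEFAULTS, **validated}
    # pre=False: defaults are merged in before the validator runs.
    return validator({**DEFAULTS, **raw})


def describe(validator: Callable[[dict], dict], pre: bool) -> str:
    """Run validation on empty input and report the outcome as a string."""
    try:
        return str(run_validation({}, validator, pre)["client_name"])
    except KeyError as exc:
        return f"KeyError: {exc}"


results = {
    "strict, pre=True": describe(strict_validator, True),
    "strict, pre=False": describe(strict_validator, False),
    "tolerant, pre=True": describe(tolerant_validator, True),
}
for label, outcome in results.items():
    print(f"{label}: {outcome}")
```

In this model only the combination of direct indexing and pre=True fails on empty input. Either running the validator after defaults are applied or using a lookup with a fallback avoids the KeyError; which of these a real fix should use is a design choice for the maintainers.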

gardner commented 3 months ago

@dosubot this was fixed in https://github.com/langchain-ai/langchain/pull/14029