langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

search_kwargs not being used in vectorstore as_retriever #21492

Open thelazydogsback opened 1 month ago

thelazydogsback commented 1 month ago

Example Code

In the following code:

vector_store = AzureSearch(...)
retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={
        "k": 8,
        "search_type": "hybrid",
        "filters": "(x eq 'foo') and (y eq 'bar')",  # Azure AI Search filter
    },
)

When _get_relevant_documents is called, the provided search_kwargs are not used; the defaults (k=4, similarity search, no filter) are applied instead. The values I passed are stored on the retriever in _lc_kwargs, but that attribute doesn't seem to be referenced anywhere.
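One way to confirm this kind of mismatch is to compare what the retriever recorded at construction time with the attributes it exposes afterwards. The following is only a rough sketch: effective_options is a hypothetical helper, and the stub object is a made-up stand-in for a real retriever, with values chosen to mimic the behaviour described above rather than taken from the library.

```python
from types import SimpleNamespace
from typing import Any, Dict


def effective_options(retriever: Any) -> Dict[str, Any]:
    """Collect the option-related attributes a retriever currently exposes."""
    return {
        "search_type": getattr(retriever, "search_type", None),
        "search_kwargs": getattr(retriever, "search_kwargs", None),
        "constructor_kwargs": getattr(retriever, "_lc_kwargs", None),
    }


# Hypothetical stand-in (an assumption, not library output): the constructor
# arguments are recorded, but the live attributes still hold defaults.
stub = SimpleNamespace(
    search_type="similarity",
    search_kwargs={},
    _lc_kwargs={"search_kwargs": {"k": 8}},
)

report = effective_options(stub)
print(report)
```

Running the same helper against the real retriever returned by as_retriever should show whether the live search_kwargs differ from what was passed in.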

This seemed to work before, so the problem may have been introduced somewhere between these versions:

langchain==0.1.16 -> 0.1.17
langchain-community==0.0.32 -> 0.0.36
langchain-core==0.1.42 -> 0.1.50
langchain-openai==0.1.3 -> 0.1.6

I am using the retriever above as part of a custom document retriever. My work-around is to set search_kwargs directly on the retriever returned by as_retriever right before I call get_relevant_documents, rather than relying on the arguments I passed to as_retriever. (I believe I got an error from pydantic when I tried to set these earlier.)
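The work-around can be sketched roughly as follows. This is a minimal illustration, not the library's own code: retrieve_with_options is a hypothetical helper, and StubRetriever is a made-up stand-in that only echoes the options it was given, so the example runs without an Azure AI Search index.

```python
from typing import Any, Dict, List


def retrieve_with_options(
    retriever: Any, query: str, search_kwargs: Dict[str, Any]
) -> List[Any]:
    """Apply search_kwargs to the retriever, then run the query."""
    # Assign immediately before the call, as described in the work-around.
    retriever.search_kwargs = dict(search_kwargs)
    return retriever.get_relevant_documents(query)


class StubRetriever:
    """Hypothetical stand-in; a real retriever would query Azure AI Search."""

    def __init__(self) -> None:
        self.search_kwargs: Dict[str, Any] = {}

    def get_relevant_documents(self, query: str) -> List[Any]:
        # Echo the options that would be sent, so the effect is observable.
        return [{"query": query, "options": dict(self.search_kwargs)}]


options = {
    "k": 8,
    "search_type": "hybrid",
    "filters": "(x eq 'foo') and (y eq 'bar')",
}
docs = retrieve_with_options(StubRetriever(), "example query", options)
print(docs[0]["options"])
```

With the real retriever, the same helper would be called with the object returned by as_retriever in place of StubRetriever().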

Error Message and Stack Trace (if applicable)

N/A

Description

I'm trying to use search_kwargs to set the ACS (Azure Cognitive Search) query options. I expect them to be honored; instead, the defaults are used.

System Info

langchain==0.1.17
langchain-community==0.0.36
langchain-core==0.1.50
langchain-openai==0.1.6

Windows, Python 3.10.11

qingdengyue commented 1 month ago

search_kwargs = {
    "k": 8,
    "search_type": 'hybrid',
    "filters": "(x eq 'foo') and (y eq 'bar')"  # Azure AI Search filter
}

Try changing it to:

retriever = vector_store.as_retriever(
    search_type="hybrid",
    k=8,
    filters="(x eq 'foo') and (y eq 'bar')",
)