langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

MultiQueryRetriever bugs with bedrock #19542

Open · kartmon61 opened this issue 3 months ago

kartmon61 commented 3 months ago


Example Code

TitanV1.py

from app.ai.embedding.embedding_model import EmbeddingModel
from langchain_community.embeddings import BedrockEmbeddings

class TitanV1(EmbeddingModel):
    def __init__(self, region='ap-northeast-1', streaming=False):
        self.region = region
        self.streaming = streaming

    def create(self) -> BedrockEmbeddings:
        # Titan text embedding model, resolved through the default AWS profile
        return BedrockEmbeddings(
            credentials_profile_name="default",
            region_name=self.region,
            model_id="amazon.titan-embed-text-v1"
        )

ClaudeV1.py

import boto3
from app.ai.llm.llm_model import LLMModel, BaseModel
from langchain.llms.bedrock import Bedrock

class ClaudeV1(LLMModel):
    def __init__(self, region='ap-northeast-1', streaming=False):
        self.region = region
        self.streaming = streaming

    def create(self) -> BaseModel:
        # Claude Instant via the bedrock-runtime client, deterministic sampling
        return Bedrock(
            client=boto3.client(
                service_name="bedrock-runtime",
                region_name=self.region
            ),
            model_id="anthropic.claude-instant-v1",
            model_kwargs={"max_tokens_to_sample": 4096, "temperature": 0.0},
            streaming=self.streaming
        )

Retriever code

from langchain_core.prompts import ChatPromptTemplate, HumanMessagePromptTemplate, PromptTemplate
from langchain.retrievers import ContextualCompressionRetriever, MultiQueryRetriever
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import ElasticsearchStore  # import missing in the original snippet
from app.ai.llm.bedrock.claude_v1 import ClaudeV1
from app.ai.llm.bedrock.claude_v3 import ClaudeV3
from app.ai.embedding.bedrock.titan_v1 import TitanV1
import logging

logging.basicConfig()
logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)
logging.getLogger("langchain.chains").setLevel(logging.INFO)

embedding_model = TitanV1().create()

vectorstore = ElasticsearchStore(
    embedding=embedding_model,
    index_name=index_name,
    es_url=es_url
)

........

# llm = ChatOpenAI(model_name="gpt-3.5-turbo-0125", temperature=0, openai_api_key=OPENAI_KEY)
llm = ClaudeV1().create()

prompt_v1 = """
You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
Question: {question}
Context: {context}
Answer:
"""

prompt = ChatPromptTemplate(
    input_variables=['context', 'question'],
    messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(
        input_variables=['context', 'question'],
        template=prompt_v1))
    ]
)

multi_template_v1 = """As an AI language model assistant, your assignment involves creating various versions of the user's question to facilitate document retrieval from specific search methodologies. Here's your task breakdown:

1. Generate two alternative versions of the original question to improve document retrieval from a vector database. These versions should rephrase or expand on the original question to align better with how vector databases interpret queries.

2. Produce three alternative versions aimed at enhancing document retrieval using the BM25 algorithm. For all of these versions, focus exclusively on extracting key keywords from the original question. This means you should not form a complete sentence but provide a concise list of keywords that capture the essence of the query.

Please clearly categorize your alternative questions: indicate which versions are for vector database searches, which are for BM25 searches, and specifically note the versions that consist solely of keywords rather than complete sentences.
Provide these alternative questions separated by newlines.
You should write the questions only in Korean.

Original question: {question}"""

QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template=multi_template_v1,
)

r = MultiQueryRetriever.from_llm(llm=llm, retriever=en_retriever, prompt=QUERY_PROMPT)

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=r,
    chain_type_kwargs={"prompt": prompt}
)

question = "what is the search service"
result = qa_chain({"query": question})
result["result"]

Error Message and Stack Trace (if applicable)

ValueError: Error raised by inference endpoint: An error occurred (ValidationException) when calling the InvokeModel operation: Malformed input request: expected minLength: 1, actual: 0, please reformat your input and try again.

Description

I'm using MultiQueryRetriever with both OpenAI and Bedrock. It works with OpenAI, but when I use the Bedrock embedding model I get the error ValueError: Error raised by inference endpoint: An error occurred (ValidationException) when calling the InvokeModel operation: Malformed input request: expected minLength: 1, actual: 0, please reformat your input and try again.

This is the same issue as https://github.com/langchain-ai/langchain/issues/17382.

INFO:langchain.retrievers.multi_query:Generated queries: ['Vector database versions:', 'Describe the main functions and purpose of a search service', 'Explain what a search service does and how it works', '', 'BM25 keyword versions: ', 'search service function works', 'main functions purpose search service', 'search service how works']

I think the root cause is the empty string '' in the generated query list above: the LLM leaves blank lines between its sections, and Bedrock cannot process an empty string, but OpenAI can.
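For now, a workable mitigation is to drop the blank queries before they reach the retriever. Below is a minimal sketch (the subclass name is hypothetical, not part of langchain), assuming langchain 0.1.x, where MultiQueryRetriever.generate_queries returns the parsed list of lines:

from typing import List

from langchain_core.callbacks import CallbackManagerForRetrieverRun
from langchain.retrievers.multi_query import MultiQueryRetriever

class NonEmptyMultiQueryRetriever(MultiQueryRetriever):
    # Hypothetical workaround: filter out empty generated queries so that
    # Bedrock's InvokeModel (which enforces minLength: 1) never sees them.
    def generate_queries(
        self, question: str, run_manager: CallbackManagerForRetrieverRun
    ) -> List[str]:
        queries = super().generate_queries(question, run_manager)
        return [q for q in queries if q.strip()]

Using NonEmptyMultiQueryRetriever.from_llm(...) in place of MultiQueryRetriever.from_llm(...) in the code above keeps the '' entries out of the embedding calls.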

System Info

Name: langchain
Version: 0.1.12
Summary: Building applications with LLMs through composability
Home-page: https://github.com/langchain-ai/langchain
Author:
Author-email:
License: MIT
Location: C:\workspace\nap-agent\venv\Lib\site-packages
Requires: aiohttp, dataclasses-json, jsonpatch, langchain-community, langchain-core, langchain-text-splitters, langsmith, numpy, pydantic, PyYAML, requests, SQLAlchemy, tenacity
Required-by: langchain-experimental

yiuc commented 1 month ago

The same issue happens in v0.2. I added a logger and can see the generated multi-queries, but I still get the same error. It looks like the output langchain parses from the LLM contains empty strings, which the Bedrock embedding model rejects as malformed input.

Multi query: INFO:langchain.retrievers.multi_query:Generated queries: ['1. How can we develop BDD test cases to mitigate OWASP Top 10 API Security Risks for the admin.apps.approve functionality?', '', '2. What are the potential security vulnerabilities in the admin.apps.approve API endpoint, and how can we create BDD tests to address the OWASP Top 10 API Security Risks?', '', '3. Considering the OWASP Top 10 API Security Risks, what BDD test scenarios should be implemented to ensure the security of the admin.apps.approve API endpoint?']

Error: ValueError: Error raised by inference endpoint: An error occurred (ValidationException) when calling the InvokeModel operation: Malformed input request: expected minLength: 1, actual: 0, please reformat your input and try again.


---------------------------------------------------------------------------
ValidationException                       Traceback (most recent call last)
File /opt/homebrew/Caskroom/miniconda/base/envs/llm/lib/python3.10/site-packages/langchain_community/embeddings/bedrock.py:135, in BedrockEmbeddings._embedding_func(self, text)
    133 try:
    134     # invoke bedrock API
--> 135     response = self.client.invoke_model(
    136         body=body,
    137         modelId=self.model_id,
    138         accept="application/json",
    139         contentType="application/json",
    140     )
    142     # format output based on provider

File /opt/homebrew/Caskroom/miniconda/base/envs/llm/lib/python3.10/site-packages/botocore/client.py:565, in ClientCreator._create_api_method.<locals>._api_call(self, *args, **kwargs)
    564 # The "self" in this scope is referring to the BaseClient.
--> 565 return self._make_api_call(operation_name, kwargs)

File /opt/homebrew/Caskroom/miniconda/base/envs/llm/lib/python3.10/site-packages/botocore/client.py:1021, in BaseClient._make_api_call(self, operation_name, api_params)
   1020     error_class = self.exceptions.from_code(error_code)
-> 1021     raise error_class(parsed_response, operation_name)
   1022 else:

ValidationException: An error occurred (ValidationException) when calling the InvokeModel operation: Malformed input request: expected minLength: 1, actual: 0, please reformat your input and try again.

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Cell In[98], line 6
      4 logging.basicConfig()
      5 logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.DEBUG)
----> 6 unique_docs = retriever_from_llm.invoke(question)

File /opt/homebrew/Caskroom/miniconda/base/envs/llm/lib/python3.10/site-packages/langchain_core/retrievers.py:194, in BaseRetriever.invoke(self, input, config, **kwargs)
    175 """Invoke the retriever to get relevant documents.
    176 
    177 Main entry point for synchronous retriever invocations.
   (...)
    191     retriever.invoke("query")
    192 """
    193 config = ensure_config(config)
--> 194 return self.get_relevant_documents(
    195     input,
    196     callbacks=config.get("callbacks"),
    197     tags=config.get("tags"),
    198     metadata=config.get("metadata"),
    199     run_name=config.get("run_name"),
    200     **kwargs,
    201 )

File /opt/homebrew/Caskroom/miniconda/base/envs/llm/lib/python3.10/site-packages/langchain_core/_api/deprecation.py:148, in deprecated.<locals>.deprecate.<locals>.warning_emitting_wrapper(*args, **kwargs)
    146     warned = True
    147     emit_warning()
--> 148 return wrapped(*args, **kwargs)

File /opt/homebrew/Caskroom/miniconda/base/envs/llm/lib/python3.10/site-packages/langchain_core/retrievers.py:323, in BaseRetriever.get_relevant_documents(self, query, callbacks, tags, metadata, run_name, **kwargs)
    321 except Exception as e:
    322     run_manager.on_retriever_error(e)
--> 323     raise e
    324 else:
    325     run_manager.on_retriever_end(
    326         result,
    327     )

File /opt/homebrew/Caskroom/miniconda/base/envs/llm/lib/python3.10/site-packages/langchain_core/retrievers.py:316, in BaseRetriever.get_relevant_documents(self, query, callbacks, tags, metadata, run_name, **kwargs)
    314 _kwargs = kwargs if self._expects_other_args else {}
    315 if self._new_arg_supported:
--> 316     result = self._get_relevant_documents(
    317         query, run_manager=run_manager, **_kwargs
    318     )
    319 else:
    320     result = self._get_relevant_documents(query, **_kwargs)

File /opt/homebrew/Caskroom/miniconda/base/envs/llm/lib/python3.10/site-packages/langchain/retrievers/multi_query.py:164, in MultiQueryRetriever._get_relevant_documents(self, query, run_manager)
    162 if self.include_original:
    163     queries.append(query)
--> 164 documents = self.retrieve_documents(queries, run_manager)
    165 return self.unique_union(documents)

File /opt/homebrew/Caskroom/miniconda/base/envs/llm/lib/python3.10/site-packages/langchain/retrievers/multi_query.py:199, in MultiQueryRetriever.retrieve_documents(self, queries, run_manager)
    197 documents = []
    198 for query in queries:
--> 199     docs = self.retriever.invoke(
    200         query, config={"callbacks": run_manager.get_child()}
    201     )
    202     documents.extend(docs)
    203 return documents

File /opt/homebrew/Caskroom/miniconda/base/envs/llm/lib/python3.10/site-packages/langchain_core/retrievers.py:194, in BaseRetriever.invoke(self, input, config, **kwargs)
    175 """Invoke the retriever to get relevant documents.
    176 
    177 Main entry point for synchronous retriever invocations.
   (...)
    191     retriever.invoke("query")
    192 """
    193 config = ensure_config(config)
--> 194 return self.get_relevant_documents(
    195     input,
    196     callbacks=config.get("callbacks"),
    197     tags=config.get("tags"),
    198     metadata=config.get("metadata"),
    199     run_name=config.get("run_name"),
    200     **kwargs,
    201 )

File /opt/homebrew/Caskroom/miniconda/base/envs/llm/lib/python3.10/site-packages/langchain_core/_api/deprecation.py:148, in deprecated.<locals>.deprecate.<locals>.warning_emitting_wrapper(*args, **kwargs)
    146     warned = True
    147     emit_warning()
--> 148 return wrapped(*args, **kwargs)

File /opt/homebrew/Caskroom/miniconda/base/envs/llm/lib/python3.10/site-packages/langchain_core/retrievers.py:323, in BaseRetriever.get_relevant_documents(self, query, callbacks, tags, metadata, run_name, **kwargs)
    321 except Exception as e:
    322     run_manager.on_retriever_error(e)
--> 323     raise e
    324 else:
    325     run_manager.on_retriever_end(
    326         result,
    327     )

File /opt/homebrew/Caskroom/miniconda/base/envs/llm/lib/python3.10/site-packages/langchain_core/retrievers.py:316, in BaseRetriever.get_relevant_documents(self, query, callbacks, tags, metadata, run_name, **kwargs)
    314 _kwargs = kwargs if self._expects_other_args else {}
    315 if self._new_arg_supported:
--> 316     result = self._get_relevant_documents(
    317         query, run_manager=run_manager, **_kwargs
    318     )
    319 else:
    320     result = self._get_relevant_documents(query, **_kwargs)

File /opt/homebrew/Caskroom/miniconda/base/envs/llm/lib/python3.10/site-packages/langchain_core/vectorstores.py:696, in VectorStoreRetriever._get_relevant_documents(self, query, run_manager)
    692 def _get_relevant_documents(
    693     self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    694 ) -> List[Document]:
    695     if self.search_type == "similarity":
--> 696         docs = self.vectorstore.similarity_search(query, **self.search_kwargs)
    697     elif self.search_type == "similarity_score_threshold":
    698         docs_and_similarities = (
    699             self.vectorstore.similarity_search_with_relevance_scores(
    700                 query, **self.search_kwargs
    701             )
    702         )

File /opt/homebrew/Caskroom/miniconda/base/envs/llm/lib/python3.10/site-packages/langchain_chroma/vectorstores.py:384, in Chroma.similarity_search(self, query, k, filter, **kwargs)
    367 def similarity_search(
    368     self,
    369     query: str,
   (...)
    372     **kwargs: Any,
    373 ) -> List[Document]:
    374     """Run similarity search with Chroma.
    375 
    376     Args:
   (...)
    382         List[Document]: List of documents most similar to the query text.
    383     """
--> 384     docs_and_scores = self.similarity_search_with_score(
    385         query, k, filter=filter, **kwargs
    386     )
    387     return [doc for doc, _ in docs_and_scores]

File /opt/homebrew/Caskroom/miniconda/base/envs/llm/lib/python3.10/site-packages/langchain_chroma/vectorstores.py:473, in Chroma.similarity_search_with_score(self, query, k, filter, where_document, **kwargs)
    465     results = self.__query_collection(
    466         query_texts=[query],
    467         n_results=k,
   (...)
    470         **kwargs,
    471     )
    472 else:
--> 473     query_embedding = self._embedding_function.embed_query(query)
    474     results = self.__query_collection(
    475         query_embeddings=[query_embedding],
    476         n_results=k,
   (...)
    479         **kwargs,
    480     )
    482 return _results_to_docs_and_scores(results)

File /opt/homebrew/Caskroom/miniconda/base/envs/llm/lib/python3.10/site-packages/langchain_community/embeddings/bedrock.py:187, in BedrockEmbeddings.embed_query(self, text)
    178 def embed_query(self, text: str) -> List[float]:
    179     """Compute query embeddings using a Bedrock model.
    180 
    181     Args:
   (...)
    185         Embeddings for the text.
    186     """
--> 187     embedding = self._embedding_func(text)
    189     if self.normalize:
    190         return self._normalize_vector(embedding)

File /opt/homebrew/Caskroom/miniconda/base/envs/llm/lib/python3.10/site-packages/langchain_community/embeddings/bedrock.py:150, in BedrockEmbeddings._embedding_func(self, text)
    148         return response_body.get("embedding")
    149 except Exception as e:
--> 150     raise ValueError(f"Error raised by inference endpoint: {e}")

ValueError: Error raised by inference endpoint: An error occurred (ValidationException) when calling the InvokeModel operation: Malformed input request: expected minLength: 1, actual: 0, please reformat your input and try again.
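
Until the line parser strips blanks upstream, a guard at the embedding layer is another option. A rough sketch (the wrapper class is hypothetical, and replacing empty input with a single space is an assumption about what the Titan endpoint will accept):

from typing import List

from langchain_community.embeddings import BedrockEmbeddings

class GuardedBedrockEmbeddings(BedrockEmbeddings):
    # Hypothetical guard: never send a zero-length string to Bedrock,
    # which rejects it with "expected minLength: 1, actual: 0".
    def embed_query(self, text: str) -> List[float]:
        return super().embed_query(text if text.strip() else " ")

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return super().embed_documents([t if t.strip() else " " for t in texts])

Filtering the queries before retrieval (as in the earlier sketch) is cleaner, though, since it also avoids wasted retrieval calls for blank queries.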