langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

Issue: Amazon Bedrock Cohere Command LLM - Malformed input request: 2 schema violations found #12790

Closed — sagarduwal closed this issue 8 months ago

sagarduwal commented 12 months ago

System Info

Sample Snippet:

from langchain.chains.question_answering import load_qa_chain

# llm and docsearch are set up as shown in the Reproduction section below
chain = load_qa_chain(llm, chain_type="stuff")

query = "decentralized identity"
docs = docsearch.similarity_search(query)

text = chain({"input_documents": docs, "question": query, "verbose": True})

Error:

ValueError: Error raised by bedrock service: An error occurred (ValidationException) when calling the InvokeModel operation: Malformed input request: 2 schema violations found, please reformat your input and try again.

Who can help?

No response

Information

Related Components

Reproduction

import boto3

from langchain.llms.bedrock import Bedrock
from langchain.embeddings import BedrockEmbeddings
from langchain.vectorstores import Pinecone
from langchain.chains.question_answering import load_qa_chain

modelId = 'cohere.command-text-v14'

bedrock_runtime = boto3.client(service_name='bedrock-runtime', region_name='us-east-1')
llm = Bedrock(model_id=modelId, client=bedrock_runtime)

# read_and_split_docs, data_directory, and index_name are helpers/values defined elsewhere in my project
docs = read_and_split_docs(data_directory)

bedrock_embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1", client=bedrock_runtime)
docsearch = Pinecone.from_texts(
    [t.page_content for t in docs],
    bedrock_embeddings,
    index_name=index_name
)

chain = load_qa_chain(llm, chain_type="stuff")

query = "Human: decentralized identity. Assistant:"
docs = docsearch.similarity_search(query)

text = chain({"input_documents": docs, "question": query, "verbose": True})

Expected behavior

The chain should return an answer from the Cohere Command model instead of raising a ValidationException.

dosubot[bot] commented 12 months ago

🤖

Hello @sagarduwal,

I'm Dosu, an AI bot here to assist you. I can help you solve bugs, answer questions, and assist you in contributing to the LangChain repository. While we wait for a human maintainer, feel free to ask me anything. I'm here to help!

Best, Dosu

Give me a moment to look at your issue. I'll post a detailed answer shortly.

dosubot[bot] commented 12 months ago

🤖

Based on the error message you're receiving, it seems like the input data provided to the model does not match the expected schema. This could be due to the format of the prompt or the model_kwargs argument.

In the LangChain framework, the LLMInputOutputAdapter.prepare_input method builds the request body differently for each Bedrock provider. For the "cohere" provider (which covers cohere.command-text-v14), it copies model_kwargs into the request body and sets prompt to the prompt string; if any of those keys do not match the schema the Cohere Command model expects, Bedrock rejects the request with the ValidationException you are seeing.

Here is the relevant code:

class LLMInputOutputAdapter:
    @classmethod
    def prepare_input(
        cls, provider: str, prompt: str, model_kwargs: Dict[str, Any]
    ) -> Dict[str, Any]:
        input_body = {**model_kwargs}
        if provider == "anthropic":
            input_body["prompt"] = _human_assistant_format(prompt)
        elif provider == "ai21" or provider == "cohere":
            input_body["prompt"] = prompt
        elif provider == "amazon":
            input_body = dict()
            input_body["inputText"] = prompt
            input_body["textGenerationConfig"] = {**model_kwargs}
        else:
            input_body["inputText"] = prompt

        if provider == "anthropic" and "max_tokens_to_sample" not in input_body:
            input_body["max_tokens_to_sample"] = 256

        return input_body

(Source: libs/langchain/langchain/llms/bedrock.py)

To resolve this issue, you should ensure that the prompt and model_kwargs match the expected format and schema for the Amazon Bedrock Cohere Command LLM. If you are unsure about the expected format or schema, you should refer to the documentation or examples provided by Amazon for this model.
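
For example, here is a minimal sketch of constraining the request body to Cohere-style generation parameters via model_kwargs. The parameter names max_tokens, temperature, p, and k are assumptions based on the Bedrock Cohere Command request schema, not values verified against your environment:

from langchain.llms.bedrock import Bedrock

# Hedged sketch: only pass keys the Cohere Command request schema is expected to accept.
llm = Bedrock(
    model_id="cohere.command-text-v14",
    client=bedrock_runtime,  # the boto3 bedrock-runtime client from the reproduction above
    model_kwargs={
        "max_tokens": 512,    # Cohere uses max_tokens, not Anthropic's max_tokens_to_sample
        "temperature": 0.3,
        "p": 0.9,
        "k": 0,
    },
)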

Additionally, there was a similar issue reported in the LangChain repository (Issue: Amazon Bedrock Cohere Command - Malformed input request: 2 schema violations found, please reformat your input and try again.). The user found that the issue was due to the LangChain library being outdated. Updating the LangChain library to the latest version (v0.0.329) resolved the issue. You might want to check your LangChain version and update it if necessary.
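
As a quick sanity check (a sketch added here for convenience, assuming you installed langchain with pip), you can print the installed version and upgrade with pip install -U langchain if it is older than v0.0.329:

import langchain

print(langchain.__version__)  # expect 0.0.329 or newer per the linked issue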

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

dosubot[bot] commented 8 months ago

Hi, @sagarduwal,

I'm helping the LangChain team manage their backlog and am marking this issue as stale. From what I understand, you reported a ValueError with the Amazon Bedrock Cohere Command LLM, resulting in a "Malformed input request" error with 2 schema violations. I have responded with a detailed analysis of the error message, provided code snippets from the LangChain repository, and suggested updating the LangChain library to the latest version as a potential solution.

Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, kindly let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!