langchain-ai / langchain-aws

Build LangChain Applications on AWS
MIT License

langchain-aws not working with some Bedrock Cohere and Amazon models #172

Closed hasansustcse13 closed 2 weeks ago

hasansustcse13 commented 1 month ago

Checked other resources

Example Code

None

Error Message and Stack Trace (if applicable)

Traceback (most recent call last):
  .....
  File "E:\Projects\CloudApper\AI-ML\Lang Sense\venv\Lib\site-packages\langchain_core\runnables\base.py", line 1332, in atransform
    async for output in self.astream(final, config, **kwargs):
  File "E:\Projects\CloudApper\AI-ML\Lang Sense\venv\Lib\site-packages\langchain_core\runnables\branch.py", line 451, in astream
    async for chunk in self.default.astream(
  File "E:\Projects\CloudApper\AI-ML\Lang Sense\venv\Lib\site-packages\langchain_core\runnables\base.py", line 3287, in astream
    async for chunk in self.atransform(input_aiter(), config, **kwargs):
  File "E:\Projects\CloudApper\AI-ML\Lang Sense\venv\Lib\site-packages\langchain_core\runnables\base.py", line 3270, in atransform
    async for chunk in self._atransform_stream_with_config(
  File "E:\Projects\CloudApper\AI-ML\Lang Sense\venv\Lib\site-packages\langchain_core\runnables\base.py", line 2161, in _atransform_stream_with_config
    chunk: Output = await asyncio.create_task(  # type: ignore[call-arg]
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Projects\CloudApper\AI-ML\Lang Sense\venv\Lib\site-packages\langchain_core\runnables\base.py", line 3240, in _atransform
    async for output in final_pipeline:
  File "E:\Projects\CloudApper\AI-ML\Lang Sense\lang_chain_wrapper\cache.py", line 132, in atransform
    async for content in _input:
  File "E:\Projects\CloudApper\AI-ML\Lang Sense\venv\Lib\site-packages\langchain_core\output_parsers\transform.py", line 85, in atransform
    async for chunk in self._atransform_stream_with_config(
  File "E:\Projects\CloudApper\AI-ML\Lang Sense\venv\Lib\site-packages\langchain_core\runnables\base.py", line 2161, in _atransform_stream_with_config
    chunk: Output = await asyncio.create_task(  # type: ignore[call-arg]
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Projects\CloudApper\AI-ML\Lang Sense\venv\Lib\site-packages\langchain_core\output_parsers\transform.py", line 39, in _atransform
    async for chunk in input:
  File "E:\Projects\CloudApper\AI-ML\Lang Sense\venv\Lib\site-packages\langchain_core\utils\aiter.py", line 127, in tee_peer
    item = await iterator.__anext__()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Projects\CloudApper\AI-ML\Lang Sense\venv\Lib\site-packages\langchain_core\runnables\base.py", line 1332, in atransform
    async for output in self.astream(final, config, **kwargs):
  File "E:\Projects\CloudApper\AI-ML\Lang Sense\venv\Lib\site-packages\langchain_core\language_models\llms.py", line 588, in astream
    raise e
  File "E:\Projects\CloudApper\AI-ML\Lang Sense\venv\Lib\site-packages\langchain_core\language_models\llms.py", line 581, in astream
    generation += chunk
  File "E:\Projects\CloudApper\AI-ML\Lang Sense\venv\Lib\site-packages\langchain_core\outputs\generation.py", line 58, in __add__
    generation_info = merge_dicts(
  File "E:\Projects\CloudApper\AI-ML\Lang Sense\venv\Lib\site-packages\langchain_core\utils\_merge.py", line 52, in merge_dicts
    raise TypeError(
TypeError: Additional kwargs key index already exists in left dict and value has unsupported type <class 'int'>.

Description

Bug 1

These two models are not working: cohere.command-r-plus-v1:0 and cohere.command-r-v1:0. cohere.command-text-v14 works fine with the same request body.

Error: An error occurred (ValidationException) when calling the InvokeModelWithResponseStream operation: Malformed input request: #: extraneous key [stream] is not permitted#: extraneous key [prompt] is not permitted, please reformat your input and try again.

Root Cause: LLMInputOutputAdapter.prepare_input doesn't build the body correctly for the Command R models; as the error message clearly states, those models accept neither the stream nor the prompt params. I think this is because the Cohere model payloads are not all alike. If I move the prompt value into message and remove those params from the input_body dict, the API call works, though that is not the proper solution to this problem.

Ref:
Cohere Command models: https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-cohere-command.html
Cohere Command R: https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-cohere-command-r-plus.html
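For illustration, a rough sketch of the two documented body shapes (field names taken from the AWS docs linked above; values are illustrative and the field sets abbreviated):

# Cohere Command (e.g. cohere.command-text-v14): prompt-style body,
# which is the shape LLMInputOutputAdapter.prepare_input currently builds
command_body = {
    "prompt": "What is RightPatient?",
    "temperature": 0.1,
    "stream": True,
}

# Cohere Command R / R+ (cohere.command-r-v1:0, cohere.command-r-plus-v1:0):
# chat-style body; per the ValidationException above, extraneous
# `prompt` and `stream` keys are rejected
command_r_body = {
    "message": "What is RightPatient?",
    "chat_history": [],
    "temperature": 0.1,
}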

Bug 2

amazon.titan-text-premier-v1:0 raises Additional kwargs key index already exists in left dict and value has unsupported type <class 'int'>., while the same request works as expected with amazon.titan-text-express-v1. See the stack trace above for the root cause.
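A minimal sketch of what the stack trace points at (assuming, as an illustration, that Titan Premier streams chunks whose generation_info carries differing int index values):

from langchain_core.utils._merge import merge_dicts

# When two streamed chunks are added together, their generation_info dicts
# are merged; merge_dicts has no rule for combining two different ints, so
# an "index" key of 0 in one chunk and 1 in the next raises the TypeError.
merge_dicts({"index": 0}, {"index": 1})
# TypeError: Additional kwargs key index already exists in left dict and
# value has unsupported type <class 'int'>.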

System Info

aiohttp==3.9.5 aiosignal==1.3.1 amazon-textract-caller==0.2.4 amazon-textract-response-parser==1.0.3 amazon-textract-textractor==1.8.2 annotated-types==0.7.0 antlr4-python3-runtime==4.9.3 anyio==4.4.0 argon2-cffi==23.1.0 argon2-cffi-bindings==21.2.0 async-timeout==4.0.3 attrs==23.2.0 Authlib==1.3.1 azure-core==1.30.2 azure-storage-blob==12.21.0 backoff==2.2.1 beautifulsoup4==4.12.3 boto3==1.34.149 botocore==1.34.149 Brotli==1.1.0 build==1.2.1 cachetools==5.4.0 certifi==2024.7.4 cffi==1.16.0 chardet==5.2.0 charset-normalizer==3.3.2 click==8.1.7 colorama==0.4.6 coloredlogs==15.0.1 confluent-kafka==2.5.0 contextvars==2.4 contourpy==1.2.1 cryptography==43.0.0 cycler==0.12.1 dataclasses-json==0.6.7 dataclasses-json-speakeasy==0.5.11 deepdiff==7.0.1 Deprecated==1.2.14 distro==1.9.0 dnspython==2.6.1 ecdsa==0.19.0 editdistance==0.8.1 effdet==0.4.1 email_validator==2.2.0 emoji==2.12.1 environs==9.5.0 et-xmlfile==1.1.0 fastapi==0.111.1 fastapi-cli==0.0.4 filelock==3.15.4 filetype==1.2.0 flatbuffers==24.3.25 fonttools==4.53.1 frozenlist==1.4.1 fsspec==2024.6.1 google-ai-generativelanguage==0.6.6 google-api-core==2.19.1 google-api-python-client==2.138.0 google-auth==2.32.0 google-auth-httplib2==0.2.0 google-generativeai==0.7.2 googleapis-common-protos==1.63.2 graypy==2.1.0 greenlet==3.0.3 grpcio==1.63.0 grpcio-status==1.62.2 gunicorn==22.0.0 h11==0.14.0 html2text==2024.2.26 httpcore==1.0.5 httplib2==0.22.0 httptools==0.6.1 httpx==0.27.0 huggingface-hub==0.24.2 humanfriendly==10.0 idna==3.7 immutables==0.20 iniconfig==2.0.0 iopath==0.1.10 isodate==0.6.1 Jinja2==3.1.4 jmespath==1.0.1 joblib==1.4.2 jsonpatch==1.33 jsonpath-python==1.0.6 jsonpointer==3.0.0 kiwisolver==1.4.5 langchain==0.2.11 langchain-aws==0.1.16 langchain-community==0.2.10 langchain-core==0.2.29 langchain-google-genai==1.0.8 langchain-milvus==0.1.3 langchain-openai==0.1.17 langchain-postgres==0.0.9 langchain-text-splitters==0.2.2 langchainhub==0.1.20 langdetect==1.0.9 langsmith==0.1.93 layoutparser==0.3.4 lxml==5.2.2 markdown-it-py==3.0.0 MarkupSafe==2.1.5 marshmallow==3.21.3 matplotlib==3.9.1 mdurl==0.1.2 minio==7.2.7 mpmath==1.3.0 multidict==6.0.5 mutagen==1.47.0 mypy-extensions==1.0.0 nest-asyncio==1.6.0 networkx==3.3 nltk==3.8.1 numexpr==2.10.1 numpy==1.26.4 omegaconf==2.3.0 onnx==1.16.1 onnxruntime==1.18.1 openai==1.37.1 opencv-python==4.10.0.84 openpyxl==3.1.5 ordered-set==4.1.0 orjson==3.10.6 packaging==24.1 pandas==2.2.2 pdf2image==1.17.0 pdfminer.six==20231228 pdfplumber==0.11.2 pgvector==0.2.5 pikepdf==9.1.0 pillow==10.4.0 pillow_heif==0.17.0 pip-review==1.3.0 pip-tools==7.4.1 playwright==1.45.1 pluggy==1.5.0 portalocker==2.10.1 proto-plus==1.24.0 protobuf==4.25.4 psutil==6.0.0 psycopg==3.2.1 psycopg-pool==3.2.2 pyarrow==17.0.0 pyasn1==0.6.0 pyasn1_modules==0.4.0 pycocotools==2.0.8 pycparser==2.22 pycryptodome==3.20.0 pycryptodomex==3.20.0 pydantic==2.8.2 pydantic_core==2.20.1 pydub==0.25.1 pyee==11.1.0 Pygments==2.18.0 pymilvus==2.4.4 PyMuPDF==1.24.9 PyMuPDFb==1.24.9 pyparsing==3.1.2 pypdf==4.3.1 pypdfium2==4.30.0 pyproject_hooks==1.1.0 pyreadline3==3.4.1 pytesseract==0.3.10 pytest==8.3.2 pytest-subtests==0.13.1 python-dateutil==2.9.0.post0 python-docx==1.1.2 python-dotenv==1.0.1 python-iso639==2024.4.27 python-jose==3.3.0 python-magic==0.4.27 python-magic-bin==0.4.14 python-multipart==0.0.9 pytz==2024.1 pywin32==306 PyYAML==6.0.1 rapidfuzz==3.9.4 regex==2024.7.24 requests==2.32.3 requests-toolbelt==1.0.0 rich==13.7.1 rsa==4.9 s3transfer==0.10.2 safetensors==0.4.3 scipy==1.14.0 shellingham==1.5.4 six==1.16.0 sniffio==1.3.1 
soupsieve==2.5 SQLAlchemy==2.0.31 starlette==0.37.2 sympy==1.13.1 tabulate==0.9.0 tenacity==8.5.0 tiktoken==0.7.0 timm==1.0.7 tokenizers==0.19.1 torch==2.2.2 torchvision==0.17.2 tqdm==4.66.4 transformers==4.43.2 typer==0.12.3 types-requests==2.32.0.20240712 typing-inspect==0.9.0 typing_extensions==4.12.2 tzdata==2024.1 ujson==5.10.0 unstructured==0.15.0 unstructured-client==0.24.1 unstructured-inference==0.7.36 unstructured.pytesseract==0.3.12 uritemplate==4.1.1 urllib3==2.2.2 uvicorn==0.30.3 watchfiles==0.22.0 websockets==12.0 wrapt==1.16.0 xlrd==2.0.1 XlsxWriter==3.2.0 yarl==1.9.4 yt-dlp==2024.7.25

ccurme commented 1 month ago

Hello, could you specify what model you are using? ChatBedrock or ChatBedrockConverse?

A quick snippet to reproduce would be helpful. Thanks!

hasansustcse13 commented 1 month ago
from langchain_aws import ChatBedrock, BedrockLLM
from langchain_core.callbacks import Callbacks
from langchain_core.language_models import BaseLanguageModel

def get_language_model(self, streaming: bool = False, temperature: int = 0, callbacks: Callbacks = None) -> BaseLanguageModel:
    llm_provider = self.get_provider(self._chat_model.model)
    if llm_provider in ['cohere', 'amazon']:
        return BedrockLLM(client=client, model_id=self._chat_model.model, streaming=streaming, callbacks=callbacks, model_kwargs={"temperature": temperature})
    return ChatBedrock(client=client, model_id=self._chat_model.model, streaming=streaming, callbacks=callbacks, model_kwargs={"temperature": temperature})

@staticmethod
def get_provider(model: str):
    return model.split('.')[0]

Note: If I use ChatBedrock for cohere and amazon in a RAG chain or with create_react_agent, it produces a different error. I am not sure why there are two classes (ChatBedrock & BedrockLLM) for the same feature. I am using astream and astream_events with the corresponding chains.

hasansustcse13 commented 1 month ago

For the Cohere bug, please see the reference (the request payload) provided earlier and the implementation of the LLMInputOutputAdapter.prepare_input method. I hope you will easily find the bug.

hasansustcse13 commented 1 month ago

@ccurme Full code:

import boto3
from langchain_aws import ChatBedrock, BedrockLLM

async def bedrock_test_async(self):
    client = boto3.client(
        'bedrock-runtime',
        region_name=self.model.chat_model.bedrock_aws_region_name,
        aws_access_key_id=self.model.chat_model.bedrock_aws_access_key_id,
        aws_secret_access_key=self.model.chat_model.bedrock_aws_secret_access_key
    )
    # self.model.chat_model.model = cohere.command-r-plus-v1:0
    llm_provider = self.model.chat_model.model.split('.')[0]
    if llm_provider in ['cohere', 'amazon']:
        llm = BedrockLLM(client=client, model_id=self.model.chat_model.model)
    else:
        llm = ChatBedrock(client=client, model_id=self.model.chat_model.model)
    async for token in llm.astream("What is RightPatient"):
        print(token)

Models that generate the error:

Note: We should have a facility to use ChatBedrock for all Bedrock models. You can see I have to use an if/else block, which I don't want.

hasansustcse13 commented 3 weeks ago

@ccurme Is there any update?

hasansustcse13 commented 3 weeks ago

@efriis Could you please look into this issue? The library is not working even for basic functionality. There are lots of bugs in this library.

ccurme commented 3 weeks ago

Hi @hasansustcse13,

Thanks for flagging these. AWS recommends using its newer Converse API for conversational applications. LangChain supports this via ChatBedrockConverse, which appears to work with all of the models mentioned here.

Example:

from langchain_aws import ChatBedrockConverse

llm = ChatBedrockConverse(model="cohere.command-r-v1:0")
# Invoke

llm.invoke("hi")
AIMessage(content="Hi! How's it going? I hope you're having a fantastic day! Is there anything I can help you with?", response_metadata={'ResponseMetadata': {'RequestId': 'f2e10fa0-d616-4dfd-a624-45054e4f21af', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Tue, 20 Aug 2024 13:27:37 GMT', 'content-type': 'application/json', 'content-length': '276', 'connection': 'keep-alive', 'x-amzn-requestid': 'f2e10fa0-d616-4dfd-a624-45054e4f21af'}, 'RetryAttempts': 0}, 'stopReason': 'end_turn', 'metrics': {'latencyMs': 355}}, id='run-c01698ad-a4da-46ef-825c-60346435b3c9-0', usage_metadata={'input_tokens': 1, 'output_tokens': 25, 'total_tokens': 26})
# Stream

from langchain_core.output_parsers import StrOutputParser

chain = llm | StrOutputParser()  # if we just want text

async for chunk in chain.astream("hi"):
    print(chunk, end="|")
|Hi|!| How|'s| it| going|?| I| hope| you|'re| having| a| fantastic| day|!| Is| there| anything| I| can| help| you| with|?||||

Regarding BedrockLLM: LangChain makes a distinction between chat models, which interact with message objects that can have roles and other metadata, and text-in/text-out LLMs. BedrockLLM is an example of the latter.
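A minimal sketch of the distinction (model IDs taken from this thread; assuming AWS credentials are configured in the environment):

from langchain_aws import BedrockLLM, ChatBedrockConverse
from langchain_core.messages import HumanMessage

# Text-in/text-out LLM: takes a prompt string, returns a completion string
llm = BedrockLLM(model_id="amazon.titan-text-express-v1")
completion: str = llm.invoke("Write one sentence about rivers.")

# Chat model: takes message objects, returns an AIMessage with metadata
chat = ChatBedrockConverse(model="cohere.command-r-v1:0")
ai_message = chat.invoke([HumanMessage(content="Write one sentence about rivers.")])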

Let me know if using ChatBedrockConverse is not viable or does not solve your problem. Thanks!

hasansustcse13 commented 3 weeks ago

@ccurme Thank you for the response. All the models work fine for the above example, but when I use some of the models with a system prompt, it produces an error.

Bug 1 (Major):

Code:

import boto3
from langchain_aws import ChatBedrockConverse
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

async def bedrock_test_async(self):
    client = boto3.client(
        'bedrock-runtime',
        region_name=self.model.chat_model.bedrock_aws_region_name,
        aws_access_key_id=self.model.chat_model.bedrock_aws_access_key_id,
        aws_secret_access_key=self.model.chat_model.bedrock_aws_secret_access_key
    )
    answer_prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a QA chat bot name `Raven`. You can answer question from your knowledge."),
        ("human", "Query: ```{question}```"),
    ])
    callbacks = [LLMLogCallbackHandlerAsync()]
    # callbacks = []
    llm = ChatBedrockConverse(client=client, model=self.model.chat_model.model, callbacks=callbacks, temperature=0.1)

    chain = answer_prompt | llm | StrOutputParser()
    async for token in chain.astream({"question": "What is your name and What can you do?"}):
        if not callbacks:
            print(token, end='')

Error: An error occurred (ValidationException) when calling the ConverseStream operation: This model doesn't support system messages. Try again without a system message or use a model that supports system messages.

Please note that some of these models (e.g. Mistral Instruct) were working when I used the ChatBedrock class.

Possible Reason: It is possible that these models' providers really don't support system messages, or that the langchain-aws library doesn't prepare the request body as expected. If a model doesn't support system messages, LangChain should convert the system message to a human message (or something similar), as ChatGoogleGenerativeAI does. That class has a property named convert_system_message_to_human, which was added after a similar bug was reported, and the flag only takes effect for models that don't support system messages. Ref: langchain-ai/langchain#14710 (merged)

Bug 2 (Minor):

Some of the models pass an array of objects (e.g. [{'type': 'text', 'text': ' M', 'index': 0}]) instead of a str to the on_llm_new_token callback. It should pass only the value of the text property: other providers such as OpenAI and Gemini push just the token, and the method's signature types token as str. A workaround sketch follows.
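Until that is fixed in the library, a handler could normalize the token itself. A hedged workaround sketch (the list-of-dicts shape is copied from the example above; extract_text is a hypothetical helper):

def extract_text(token) -> str:
    # Flatten Bedrock-style content blocks, e.g.
    # [{'type': 'text', 'text': ' M', 'index': 0}], into plain text
    if isinstance(token, list):
        return "".join(
            part.get("text", "")
            for part in token
            if isinstance(part, dict) and part.get("type") == "text"
        )
    return token

Calling extract_text(token) at the top of on_llm_new_token keeps the rest of the handler provider-agnostic.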

Callback:

import time
from datetime import datetime
from typing import Dict, Any, List, Optional, Union
from uuid import UUID

from colorama import Fore
from langchain_core.callbacks import AsyncCallbackHandler
from langchain_core.outputs import LLMResult, GenerationChunk, ChatGenerationChunk

class LLMLogCallbackHandlerAsync(AsyncCallbackHandler):

    def __init__(self):
        self.start: Optional[float] = None
        self.first = True
        self.llm_logs = []

    async def on_llm_start(
            self,
            serialized: Dict[str, Any],
            prompts: List[str],
            *,
            run_id: UUID,
            parent_run_id: Optional[UUID] = None,
            tags: Optional[List[str]] = None,
            metadata: Optional[Dict[str, Any]] = None,
            invocation_params: Optional[Dict[str, Any]] = None,
            **kwargs: Any,
    ) -> None:
        from sense_log import logger

        self.start = time.time()
        self.first = True
        self.llm_logs.append({'run_id': str(run_id), 'prompts': prompts})

        logger.clog(f"LLM START - {datetime.now()} RUNNER - {run_id}", Fore.RED)
        joined_prompts = '\n\n'.join(prompts)
        logger.clog(f"{joined_prompts}", Fore.GREEN)

    async def on_llm_new_token(
            self,
            token: str,
            *,
            chunk: Optional[Union[GenerationChunk, ChatGenerationChunk]] = None,
            run_id: UUID,
            parent_run_id: Optional[UUID] = None,
            tags: Optional[List[str]] = None,
            **kwargs: Any,
    ) -> None:
        if self.first:
            print(f"COMPLETION - START ON - {datetime.now()} - START AFTER - {time.time() - self.start}", Fore.YELLOW)
            self.first = False
        print(token)

    async def on_llm_end(self, response: LLMResult, *, run_id: UUID, parent_run_id: Optional[UUID] = None, **kwargs: Any) -> Any:
        texts = [generation.text for generation_list in response.generations for generation in generation_list]
        [log.update({'completion': texts}) for log in self.llm_logs if log.get('run_id') == str(run_id)]

        joined_texts = '\n\n'.join(texts)
        print(f"\nLLM END - {datetime.now()} RUNNER - {run_id}", Fore.RED)
        print(f"TOTAL COMPLETION TIME - {time.time() - self.start}", Fore.YELLOW)
        print(f"{joined_texts}", Fore.CYAN)


Note: Some providers (Mistral) appear in both bugs, but the models are not the same; I suspect the 2nd bug also affects the models from bug 1. Also note that Mistral Large works as expected with a system message, but Mistral Instruct does not.

CC: @efriis

efriis commented 2 weeks ago

Pretty sure this error message is coming from the Bedrock API, so I'd guess these models just don't support system messages on Bedrock.

If you remove the system message from the prompt, it should work!

hasansustcse13 commented 2 weeks ago

@efriis I have explained that it's possible these models don't support system messages. That's why I mentioned it would be nice to introduce a convert_system_message_to_human flag, the same as the Google Gemini model has. Otherwise developers need to handle this manually if the project has to support Bedrock, OpenAI, Gemini, or other LLMs based on user input. So I think it should be handled by the package itself, not by the developers using it. One more thing: LangChain's default RAG QA chain and react_agent chain also won't work with a system message. I have also mentioned another bug, related to the callback handler.

efriis commented 1 week ago

noted and appreciate your writeup!

cc @baskaryan another type of message history conversion we may want to support

to get this functionality today, a workaround lambda would be:

from langchain_core.messages import HumanMessage, SystemMessage

llm = ChatBedrock(...)

def convert_system_message_to_human(history: list):
    # replace each SystemMessage with an equivalent HumanMessage
    return [
        HumanMessage(content=m.content) if isinstance(m, SystemMessage) else m
        for m in history
    ]

compatible_bedrock = convert_system_message_to_human | llm
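For example (a hedged usage sketch; the message contents are illustrative, and the function is coerced to a RunnableLambda by the pipe):

compatible_bedrock.invoke([
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="hi"),
])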