langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

SmartLLMChain errors with Bedrock Anthropic models and n_ideas>1 #11261

Closed austinmw closed 3 months ago

austinmw commented 10 months ago

System Info

langchain version v0.0.305

Python 3.11

Who can help?

@hwchase17 @agola11

Information

Related Components

Reproduction

from langchain.chat_models import BedrockChat
from langchain.prompts import PromptTemplate
from langchain_experimental.smart_llm import SmartLLMChain

claudev2 = BedrockChat(
    client=bedrock_inference,
    model_id="anthropic.claude-v2",
)

prompt = PromptTemplate.from_template(hard_question)

chain = SmartLLMChain(
    llm=claudev2,
    prompt=prompt,
    n_ideas=2,
)

response = chain.run({})

print(response)
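
The snippet above assumes bedrock_inference and hard_question are defined earlier in the notebook. For anyone reproducing this, they might look roughly like the following (the client setup and region are my assumptions; the question is taken from the failing prompt quoted further down):

import boto3

# Assumed setup, not from the original report: a Bedrock runtime client and
# an example question for the chain to ideate over.
bedrock_inference = boto3.client("bedrock-runtime", region_name="us-east-1")
hard_question = "What type of mammal lays the biggest eggs?"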

Results in error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/Users/austinmw/Desktop/bedrocktesting/bedrock_test.ipynb Cell 10 line 1
      5 prompt = PromptTemplate.from_template(hard_question)
      7 chain = SmartLLMChain(
      8     ideation_llm=claudev2,
      9     critique_llm=claudeinstant,
   (...)
     15     #verbose=True,
     16 )
---> 18 response = chain.run({} )
     20 print(response)

File ~/mambaforge/envs/bedrock/lib/python3.11/site-packages/langchain/chains/base.py:507, in Chain.run(self, callbacks, tags, metadata, *args, **kwargs)
    505     if len(args) != 1:
    506         raise ValueError("`run` supports only one positional argument.")
--> 507     return self(args[0], callbacks=callbacks, tags=tags, metadata=metadata)[
    508         _output_key
    509     ]
    511 if kwargs and not args:
    512     return self(kwargs, callbacks=callbacks, tags=tags, metadata=metadata)[
    513         _output_key
    514     ]

File ~/mambaforge/envs/bedrock/lib/python3.11/site-packages/langchain/chains/base.py:312, in Chain.__call__(self, inputs, return_only_outputs, callbacks, tags, metadata, run_name, include_run_info)
    310 except BaseException as e:
    311     run_manager.on_chain_error(e)
--> 312     raise e
    313 run_manager.on_chain_end(outputs)
    314 final_outputs: Dict[str, Any] = self.prep_outputs(
    315     inputs, outputs, return_only_outputs
    316 )

File ~/mambaforge/envs/bedrock/lib/python3.11/site-packages/langchain/chains/base.py:306, in Chain.__call__(self, inputs, return_only_outputs, callbacks, tags, metadata, run_name, include_run_info)
    299 run_manager = callback_manager.on_chain_start(
    300     dumpd(self),
    301     inputs,
    302     name=run_name,
    303 )
    304 try:
    305     outputs = (
--> 306         self._call(inputs, run_manager=run_manager)
    307         if new_arg_supported
    308         else self._call(inputs)
    309     )
    310 except BaseException as e:
    311     run_manager.on_chain_error(e)

File ~/mambaforge/envs/bedrock/lib/python3.11/site-packages/langchain_experimental/smart_llm/base.py:166, in SmartLLMChain._call(self, input_list, run_manager)
    164 ideas = self._ideate(stop, run_manager)
    165 self.history.ideas = ideas
--> 166 critique = self._critique(stop, run_manager)
    167 self.history.critique = critique
    168 resolution = self._resolve(stop, run_manager)

File ~/mambaforge/envs/bedrock/lib/python3.11/site-packages/langchain_experimental/smart_llm/base.py:293, in SmartLLMChain._critique(self, stop, run_manager)
    290 callbacks = run_manager.handlers if run_manager else None
    291 if llm:
    292     critique = self._get_text_from_llm_result(
--> 293         llm.generate_prompt([prompt], stop, callbacks), step="critique"
    294     )
    295     _colored_text = get_colored_text(critique, "yellow")
    296     _text = "Critique:\n" + _colored_text

File ~/mambaforge/envs/bedrock/lib/python3.11/site-packages/langchain/chat_models/base.py:475, in BaseChatModel.generate_prompt(self, prompts, stop, callbacks, **kwargs)
    467 def generate_prompt(
    468     self,
    469     prompts: List[PromptValue],
   (...)
    472     **kwargs: Any,
    473 ) -> LLMResult:
    474     prompt_messages = [p.to_messages() for p in prompts]
--> 475     return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)

File ~/mambaforge/envs/bedrock/lib/python3.11/site-packages/langchain/chat_models/base.py:365, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, **kwargs)
    363         if run_managers:
    364             run_managers[i].on_llm_error(e)
--> 365         raise e
    366 flattened_outputs = [
    367     LLMResult(generations=[res.generations], llm_output=res.llm_output)
    368     for res in results
    369 ]
    370 llm_output = self._combine_llm_outputs([res.llm_output for res in results])

File ~/mambaforge/envs/bedrock/lib/python3.11/site-packages/langchain/chat_models/base.py:355, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, **kwargs)
    352 for i, m in enumerate(messages):
    353     try:
    354         results.append(
--> 355             self._generate_with_cache(
    356                 m,
    357                 stop=stop,
    358                 run_manager=run_managers[i] if run_managers else None,
    359                 **kwargs,
    360             )
    361         )
    362     except BaseException as e:
    363         if run_managers:

File ~/mambaforge/envs/bedrock/lib/python3.11/site-packages/langchain/chat_models/base.py:507, in BaseChatModel._generate_with_cache(self, messages, stop, run_manager, **kwargs)
    503     raise ValueError(
    504         "Asked to cache, but no cache found at `langchain.cache`."
    505     )
    506 if new_arg_supported:
--> 507     return self._generate(
    508         messages, stop=stop, run_manager=run_manager, **kwargs
    509     )
    510 else:
    511     return self._generate(messages, stop=stop, **kwargs)

/Users/austinmw/Desktop/bedrocktesting/bedrock_test.ipynb Cell 10 line 3
    366     if stop:
    367         params["stop_sequences"] = stop
--> 369     completion = self._prepare_input_and_invoke(
    370         prompt=prompt, stop=stop, run_manager=run_manager, **params
    371     )
    373 message = AIMessage(content=completion)
    374 return ChatResult(generations=[ChatGeneration(message=message)])

/Users/austinmw/Desktop/bedrocktesting/bedrock_test.ipynb Cell 10 line 2
    223 provider = self._get_provider()
    224 params = {**_model_kwargs, **kwargs}
--> 225 input_body = LLMInputOutputAdapterv2.prepare_input(provider, prompt, params)
    226 body = json.dumps(input_body)
    227 accept = "application/json"

/Users/austinmw/Desktop/bedrocktesting/bedrock_test.ipynb Cell 10 line 7
     73 input_body = {**model_kwargs}
     74 if provider == "anthropic":
---> 75     input_body["prompt"] = _human_assistant_format_v2(prompt)
     76 elif provider == "ai21":
     77     input_body["prompt"] = prompt

/Users/austinmw/Desktop/bedrocktesting/bedrock_test.ipynb Cell 10 line 4
     47             count += 1
     48         else:
---> 49             raise ValueError(ALTERNATION_ERROR)
     51 if count % 2 == 1:  # Only saw Human, no Assistant
     52     input_text = input_text + ASSISTANT_PROMPT  # SILENT CORRECTION

ValueError: Error: Prompt must alternate between '\n\nHuman:' and '\n\nAssistant:'.

Expected behavior

No error

dosubot[bot] commented 10 months ago

🤖

Hello,

The error you're encountering is due to the way the Bedrock class in LangChain is designed. It checks the alternation between 'Human:' and 'Assistant:' in the input text. If the alternation is not correct, it raises a ValueError with the message "Error: Prompt must alternate between '\n\nHuman:' and '\n\nAssistant:'."

Here is the relevant code:

def _human_assistant_format(input_text: str) -> str:
    ...
    count = 0
    # track alternation
    for i in range(len(input_text)):
        if input_text[i : i + len(HUMAN_PROMPT)] == HUMAN_PROMPT:
            if count % 2 == 0:
                count += 1
            else:
                raise ValueError(ALTERNATION_ERROR)
        if input_text[i : i + len(ASSISTANT_PROMPT)] == ASSISTANT_PROMPT:
            if count % 2 == 1:
                count += 1
            else:
                raise ValueError(ALTERNATION_ERROR)
    ...

In this code, count is used to track the alternation between 'Human:' and 'Assistant:'. If 'Human:' is found when count is even, count is incremented. If 'Assistant:' is found when count is odd, count is also incremented. If these conditions are not met, a ValueError is raised.

Therefore, to avoid this error, you need to ensure that your input text alternates correctly between 'Human:' and 'Assistant:'.
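
To make the failure mode concrete, here is a minimal, self-contained sketch of that alternation check (an approximation of the snippet above, not the exact langchain source); a prompt with two consecutive Assistant turns trips it:

HUMAN_PROMPT = "\n\nHuman:"
ASSISTANT_PROMPT = "\n\nAssistant:"
ALTERNATION_ERROR = (
    "Error: Prompt must alternate between '\n\nHuman:' and '\n\nAssistant:'."
)

def check_alternation(input_text: str) -> None:
    # Simplified version of the alternation tracking in _human_assistant_format.
    count = 0
    for i in range(len(input_text)):
        if input_text[i : i + len(HUMAN_PROMPT)] == HUMAN_PROMPT:
            if count % 2 == 0:
                count += 1
            else:
                raise ValueError(ALTERNATION_ERROR)
        if input_text[i : i + len(ASSISTANT_PROMPT)] == ASSISTANT_PROMPT:
            if count % 2 == 1:
                count += 1
            else:
                raise ValueError(ALTERNATION_ERROR)

# Two consecutive Assistant turns, which is what SmartLLMChain's critique step
# produces with n_ideas=2, raise the ValueError:
check_alternation(
    "\n\nHuman: question\n\nAssistant: idea 1\n\nAssistant: idea 2"
)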

As for the PromptTemplate class, it is used to format the prompts that are fed into the language models in the SmartLLMChain class. The PromptTemplate class formats the prompts based on the examples and the input variables provided. The formatted prompts are then used by the SmartLLMChain class to generate responses from the language models.

Here is an example of how the PromptTemplate class might be used:

from langchain.prompts import FewShotPromptWithTemplates, PromptTemplate

prompt = FewShotPromptWithTemplates(
    examples=[{"input": "Translate this English text to French: Hello",
               "output": "Traduisez ce texte anglais en français : Bonjour"}],
    example_prompt=PromptTemplate(input_variables=["input", "output"],
                                  template="{input}\n{output}\n"),
    suffix=PromptTemplate(input_variables=["text"],
                          template="Translate this English text to French: {text}"),
    input_variables=["text"],
    template_format="f-string",
)

formatted_prompt = prompt.format(text="Hello, world!")

In this example, the prompt templates are used to format a few-shot prompt for a translation task. The formatted prompt could then be passed to a chain such as SmartLLMChain to generate a French translation of the text "Hello, world!".

I hope this helps! Let me know if you have any other questions.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

austinmw commented 10 months ago

Below is an example of a prompt that would fail this validation check when setting n_ideas=2, since two "Assistant:" tags come in a row:

Human: Question: What type of mammal lays the biggest eggs?
Answer: Let's work this out in a step by step way to be sure we have the right answer:

Assistant: Idea 1:  Here are the steps to determine which mammal lays the biggest eggs:

1) Mammals that lay eggs are monotremes. The only living monotremes are the platypus and four species of echidna.

2) Between the platypus and echidna, the echidna lays larger eggs. 

3) The largest echidna species is the western long-beaked echidna. 

4) The western long-beaked echidna egg is about 15 mm x 18 mm and weighs around 0.5 grams.

5) Therefore, the western long-beaked echidna lays the biggest eggs of any living mammal.

In summary:
- Monotremes (platypus and echidnas) are the only mammals that lay eggs
- Between platypus and echidnas, echidnas lay larger eggs 
- The western long-beaked echidna is the largest echidna species
- So the western long-beaked echidna lays the biggest eggs of any living mammal.

Assistant: Idea 2:  Here are the steps to determine what type of mammal lays the biggest eggs:

1. Mammals are divided into three groups based on how they give birth: monotremes, marsupials, and placental mammals.

2. Monotremes are the only mammals that lay eggs. The monotremes are the platypus and four species of echidna. 

3. The platypus lays small eggs, around the size of a grape. 

4. Echidnas lay a single soft-shelled, leathery egg at a time. Their eggs are around 13 mm across, or about 0.5 inches.

5. Therefore, of the egg-laying mammals, the echidna lays the biggest egg.

So in summary, the mammal that lays the biggest eggs is the echidna.

Human: You are a researcher tasked with investigating the 2 response options provided. List the flaws and faulty logic of each answer option. Let's work this out in a step by step way to be sure we have all the errors:

Assistant:
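
For context, the two back-to-back Assistant turns come from the way SmartLLMChain builds its critique step: the chat prompt is roughly [Human(question), AI(idea 1), AI(idea 2), Human(critique instruction)], and each AI message becomes its own "\n\nAssistant:" turn when flattened into Anthropic's text format. The sketch below is my approximation of that flattening, not the actual conversion code in langchain:

from langchain.schema import AIMessage, BaseMessage, HumanMessage

def to_anthropic_text(messages: list[BaseMessage]) -> str:
    # Toy flattening: one "\n\nHuman:"/"\n\nAssistant:" turn per message
    # (system messages and other roles are ignored for simplicity).
    parts = []
    for m in messages:
        role = "Human" if isinstance(m, HumanMessage) else "Assistant"
        parts.append(f"\n\n{role}: {m.content}")
    return "".join(parts) + "\n\nAssistant:"

critique_messages = [
    HumanMessage(content="Question: What type of mammal lays the biggest eggs? ..."),
    AIMessage(content="Idea 1: ..."),
    AIMessage(content="Idea 2: ..."),
    HumanMessage(content="You are a researcher tasked with investigating the 2 response options provided. ..."),
]

# The flattened prompt contains "Assistant:" twice in a row, which is what the
# Bedrock alternation check rejects.
print(to_anthropic_text(critique_messages))
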
seabasshn commented 10 months ago

Hey,

how can I fix the same issue when running this:

from langchain.chains.summarize import load_summarize_chain

summary_chain = load_summarize_chain(llm=llm, chain_type="map_reduce", verbose=False)
output = summary_chain.run(docs)

Error:

---------------------------------------------------------------------------
ValidationException                       Traceback (most recent call last)
File /opt/conda/lib/python3.8/site-packages/langchain/llms/bedrock.py:191, in Bedrock._call(self, prompt, stop, run_manager, **kwargs)
    190 try:
--> 191     response = self.client.invoke_model(
    192         body=body, modelId=self.model_id, accept=accept, contentType=contentType
    193     )
    194     text = LLMInputOutputAdapter.prepare_output(provider, response)

File /opt/conda/lib/python3.8/site-packages/botocore/client.py:535, in ClientCreator._create_api_method.<locals>._api_call(self, *args, **kwargs)
    534 # The "self" in this scope is referring to the BaseClient.
--> 535 return self._make_api_call(operation_name, kwargs)

File /opt/conda/lib/python3.8/site-packages/botocore/client.py:980, in BaseClient._make_api_call(self, operation_name, api_params)
    979     error_class = self.exceptions.from_code(error_code)
--> 980     raise error_class(parsed_response, operation_name)
    981 else:

ValidationException: An error occurred (ValidationException) when calling the InvokeModel operation: Invalid prompt: prompt must start with "\n\nHuman:" turn, prompt must end with "\n\nAssistant:" turn

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Cell In[19], line 1
----> 1 output = summary_chain.run(docs)

File /opt/conda/lib/python3.8/site-packages/langchain/chains/base.py:451, in Chain.run(self, callbacks, tags, metadata, *args, **kwargs)
    449     if len(args) != 1:
    450         raise ValueError("`run` supports only one positional argument.")
--> 451     return self(args[0], callbacks=callbacks, tags=tags, metadata=metadata)[
    452         _output_key
    453     ]
    455 if kwargs and not args:
    456     return self(kwargs, callbacks=callbacks, tags=tags, metadata=metadata)[
    457         _output_key
    458     ]

File /opt/conda/lib/python3.8/site-packages/langchain/chains/base.py:258, in Chain.__call__(self, inputs, return_only_outputs, callbacks, tags, metadata, include_run_info)
    256 except (KeyboardInterrupt, Exception) as e:
    257     run_manager.on_chain_error(e)
--> 258     raise e
    259 run_manager.on_chain_end(outputs)
    260 final_outputs: Dict[str, Any] = self.prep_outputs(
    261     inputs, outputs, return_only_outputs
    262 )

File /opt/conda/lib/python3.8/site-packages/langchain/chains/base.py:252, in Chain.__call__(self, inputs, return_only_outputs, callbacks, tags, metadata, include_run_info)
    246 run_manager = callback_manager.on_chain_start(
    247     dumpd(self),
    248     inputs,
    249 )
    250 try:
    251     outputs = (
--> 252         self._call(inputs, run_manager=run_manager)
    253         if new_arg_supported
    254         else self._call(inputs)
    255     )
    256 except (KeyboardInterrupt, Exception) as e:
    257     run_manager.on_chain_error(e)

File /opt/conda/lib/python3.8/site-packages/langchain/chains/combine_documents/base.py:106, in BaseCombineDocumentsChain._call(self, inputs, run_manager)
    104 # Other keys are assumed to be needed for LLM prediction
    105 other_keys = {k: v for k, v in inputs.items() if k != self.input_key}
--> 106 output, extra_return_dict = self.combine_docs(
    107     docs, callbacks=_run_manager.get_child(), **other_keys
    108 )
    109 extra_return_dict[self.output_key] = output
    110 return extra_return_dict

File /opt/conda/lib/python3.8/site-packages/langchain/chains/combine_documents/map_reduce.py:210, in MapReduceDocumentsChain.combine_docs(self, docs, token_max, callbacks, **kwargs)
    198 def combine_docs(
    199     self,
    200     docs: List[Document],
   (...)
    203     **kwargs: Any,
    204 ) -> Tuple[str, dict]:
    205     """Combine documents in a map reduce manner.
    206
    207     Combine by mapping first chain over all documents, then reducing the results.
    208     This reducing can be done recursively if needed (if there are many documents).
    209     """
--> 210     map_results = self.llm_chain.apply(
    211         # FYI - this is parallelized and so it is fast.
    212         [{self.document_variable_name: d.page_content, **kwargs} for d in docs],
    213         callbacks=callbacks,
    214     )
    215     question_result_key = self.llm_chain.output_key
    216     result_docs = [
    217         Document(page_content=r[question_result_key], metadata=docs[i].metadata)
    218         # This uses metadata from the docs, and the textual results from results
    219         for i, r in enumerate(map_results)
    220     ]

File /opt/conda/lib/python3.8/site-packages/langchain/chains/llm.py:186, in LLMChain.apply(self, input_list, callbacks)
    184 except (KeyboardInterrupt, Exception) as e:
    185     run_manager.on_chain_error(e)
--> 186     raise e
    187 outputs = self.create_outputs(response)
    188 run_manager.on_chain_end({"outputs": outputs})

File /opt/conda/lib/python3.8/site-packages/langchain/chains/llm.py:183, in LLMChain.apply(self, input_list, callbacks)
    178 run_manager = callback_manager.on_chain_start(
    179     dumpd(self),
    180     {"input_list": input_list},
    181 )
    182 try:
--> 183     response = self.generate(input_list, run_manager=run_manager)
    184 except (KeyboardInterrupt, Exception) as e:
    185     run_manager.on_chain_error(e)

File /opt/conda/lib/python3.8/site-packages/langchain/chains/llm.py:102, in LLMChain.generate(self, input_list, run_manager)
    100 """Generate LLM result from inputs."""
    101 prompts, stop = self.prep_prompts(input_list, run_manager=run_manager)
--> 102 return self.llm.generate_prompt(
    103     prompts,
    104     stop,
    105     callbacks=run_manager.get_child() if run_manager else None,
    106     **self.llm_kwargs,
    107 )

File /opt/conda/lib/python3.8/site-packages/langchain/llms/base.py:451, in BaseLLM.generate_prompt(self, prompts, stop, callbacks, **kwargs)
    443 def generate_prompt(
    444     self,
    445     prompts: List[PromptValue],
   (...)
    448     **kwargs: Any,
    449 ) -> LLMResult:
    450     prompt_strings = [p.to_string() for p in prompts]
--> 451     return self.generate(prompt_strings, stop=stop, callbacks=callbacks, **kwargs)

File /opt/conda/lib/python3.8/site-packages/langchain/llms/base.py:582, in BaseLLM.generate(self, prompts, stop, callbacks, tags, metadata, **kwargs)
    573     raise ValueError(
    574         "Asked to cache, but no cache found at `langchain.cache`."
    575     )
    576 run_managers = [
    577     callback_manager.on_llm_start(
    578         dumpd(self), [prompt], invocation_params=params, options=options
    579     )[0]
    580     for callback_manager, prompt in zip(callback_managers, prompts)
    581 ]
--> 582 output = self._generate_helper(
    583     prompts, stop, run_managers, bool(new_arg_supported), **kwargs
    584 )
    585 return output
    586 if len(missing_prompts) > 0:

File /opt/conda/lib/python3.8/site-packages/langchain/llms/base.py:488, in BaseLLM._generate_helper(self, prompts, stop, run_managers, new_arg_supported, **kwargs)
    486 for run_manager in run_managers:
    487     run_manager.on_llm_error(e)
--> 488 raise e
    489 flattened_outputs = output.flatten()
    490 for manager, flattened_output in zip(run_managers, flattened_outputs):

File /opt/conda/lib/python3.8/site-packages/langchain/llms/base.py:475, in BaseLLM._generate_helper(self, prompts, stop, run_managers, new_arg_supported, **kwargs)
    465 def _generate_helper(
    466     self,
    467     prompts: List[str],
   (...)
    471     **kwargs: Any,
    472 ) -> LLMResult:
    473     try:
    474         output = (
--> 475             self._generate(
    476                 prompts,
    477                 stop=stop,
    478                 # TODO: support multiple run managers
    479                 run_manager=run_managers[0] if run_managers else None,
    480                 **kwargs,
    481             )
    482             if new_arg_supported
    483             else self._generate(prompts, stop=stop)
    484         )
    485     except (KeyboardInterrupt, Exception) as e:
    486         for run_manager in run_managers:

File /opt/conda/lib/python3.8/site-packages/langchain/llms/base.py:961, in LLM._generate(self, prompts, stop, run_manager, **kwargs)
    958 new_arg_supported = inspect.signature(self._call).parameters.get("run_manager")
    959 for prompt in prompts:
    960     text = (
--> 961         self._call(prompt, stop=stop, run_manager=run_manager, **kwargs)
    962         if new_arg_supported
    963         else self._call(prompt, stop=stop, **kwargs)
    964     )
    965     generations.append([Generation(text=text)])
    966 return LLMResult(generations=generations)

File /opt/conda/lib/python3.8/site-packages/langchain/llms/bedrock.py:197, in Bedrock._call(self, prompt, stop, run_manager, **kwargs)
    194     text = LLMInputOutputAdapter.prepare_output(provider, response)
    196 except Exception as e:
--> 197     raise ValueError(f"Error raised by bedrock service: {e}")
    199 if stop is not None:
    200     text = enforce_stop_tokens(text, stop)

ValueError: Error raised by bedrock service: An error occurred (ValidationException) when calling the InvokeModel operation: Invalid prompt: prompt must start with "\n\nHuman:" turn, prompt must end with "\n\nAssistant:" turn
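
For what it's worth, one possible workaround (my sketch, not something proposed in this thread) is to give the map_reduce chain Claude-style prompts that start with "\n\nHuman:" and end with "\n\nAssistant:", which is exactly what the ValidationException asks for; llm and docs are the same objects as in the snippet above:

from langchain.chains.summarize import load_summarize_chain
from langchain.prompts import PromptTemplate

# Claude-formatted prompt used for both the map and combine steps.
claude_prompt = PromptTemplate.from_template(
    "\n\nHuman: Write a concise summary of the following text:\n\n{text}\n\nAssistant:"
)

summary_chain = load_summarize_chain(
    llm=llm,
    chain_type="map_reduce",
    map_prompt=claude_prompt,
    combine_prompt=claude_prompt,
    verbose=False,
)
output = summary_chain.run(docs)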

priestleyn commented 8 months ago

I get the same problem when using ConversationChain with version 0.0.344 and Python 3.9.11. Code to replicate:

from langchain.chat_models import BedrockChat
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

llm = BedrockChat(model_id="anthropic.claude-v2", model_kwargs={"temperature": 0.0})
memory = ConversationBufferMemory()
conversation = ConversationChain(llm=llm, memory=memory, verbose=False)
conversation.predict(input="Hi, my name is Andrew")

.../.venv/lib/python3.9/site-packages/langchain/llms/bedrock.py:52: UserWarning: Error: Prompt must alternate between '\n\nHuman:' and '\n\nAssistant:'. Received

Human: The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

Human: Hi, my name is Andrew
AI:

Assistant:
  warnings.warn(ALTERNATION_ERROR + f" Received {input_text}")

DamianJot commented 8 months ago

If this response from @dosubot (October 1) is accurate:

def _human_assistant_format(input_text: str) -> str:
    ...
    count = 0
    # track alternation
    for i in range(len(input_text)):
        if input_text[i : i + len(HUMAN_PROMPT)] == HUMAN_PROMPT:
            if count % 2 == 0:
                count += 1
            else:
                raise ValueError(ALTERNATION_ERROR)
        if input_text[i : i + len(ASSISTANT_PROMPT)] == ASSISTANT_PROMPT:
            if count % 2 == 1:
                count += 1
            else:
                raise ValueError(ALTERNATION_ERROR)
    ...

In this code, count is used to track the alternation between 'Human:' and 'Assistant:'. If 'Human:' is found when count is even, count is incremented. If 'Assistant:' is found when count is odd, count is also incremented. If these conditions are not met, a ValueError is raised.

I think there may be a real problem with some LangChain functionality, for example when adding ConversationBufferWindowMemory(). Let me make my point.

We define a prompt for our memory buffer ConversationBufferWindowMemory(). For simplicity, let it be:

SUMMARY_TEMPLATE_ANTHROPIC = """\n\nHuman:
Progressively summarize the lines of conversation provided between <nl></nl> tags, adding their summary to previous summary provided betweem <s></s> tags and return a new updated summary.

Current summary:\n<s>{summary}</s>\nNew lines of conversation:\n<nl>{new_lines}<\nl>\n\nAssistant:"""

and define our memory buffer

summary_memory = ConversationSummaryMemory(
    llm=llm, # anthropic model
    memory_key='history', # not important now for us
    input_key="question", # not important now for us
    human_prefix='Human', # ISSUE 2 (Change to something different than Human)
    ai_prefix='Assistant', # ISSUE 3 (Change to something different than Assistant)
    #template=SUMMARY_TEMPLATE, # OUR TEMPLATE SHOULD GO HERE BUT SOMEHOW IT USED THE DEFAULT PROMPT NOT DEFINED BY US (IDENTIFIED PROBLEM 1)
    max_token_limit=100, # optional
    return_messages=True # optional
)
summary_memory.prompt.template = SUMMARY_TEMPLATE_ANTHROPIC # (FIX OF PROBLEM 1)

Then, when this memory needs to update its buffer, it calls the LLM, and the prompt that goes out looks like this (note what ends up between the <nl> and <\nl> tags):

{
  "prompts": [
    "**Human:**'Progressively summarize the lines of conversation provided between <nl></nl> tags, adding their summary to previous summary provided betweem <s></s> tags and return a new updated summary.\n\nCurrent summary:\n<s></s>\n\nNew lines of conversation:\n**<nl>Human: Hello\nAssistant: Hello, I am Kuehne + Nagel Artificial Intelligence Assistant. It will be a pleasure to help you with your questions about anything eShipAsia.<\nl>**\\n\\nAssistant:"
  ]
}

This violates the rule that an Assistant turn must always follow a Human turn: here the order is Human, Human, Assistant, Assistant. LangChain then emits the warning (visible when langchain.debug = True). In practice Anthropic handles this perfectly well, and we get our updated summary despite the warning; it is only LangChain's format-checking step that complains.

One way to solve it is to address ISSUE 2 (change the prefix to e.g. "Pearson") and ISSUE 3 (change it to e.g. "AI"), and the problem is gone for this case. But only for this scenario: if "Human" or "Assistant" is ever generated in an LLM response and ends up inside {summary} or {new_lines}, we are stuck again. What are the options? A separate call to reformat the output and strip those words? Some parser? And what if we use other chains where such words get passed around (which is what I actually tested)? We get red warnings everywhere. Everything still runs and the LLM handles the prompt fine, but the screen gets flooded with warnings that are not a deal breaker for the model. This is quite annoying, and with many chains or calls it makes working with langchain.debug = True a real pain.

IN SHORT: any approach that builds a prompt from the output of another LLM call (which is generative in nature) cannot guarantee that the problematic words (Human, Assistant) will never appear somewhere and trigger the warning.

rizblie commented 7 months ago

I have a similar issue. I am using the following template for ConversationalChat with Claude 2.1 via Bedrock, on Python 3.12 and langchain 0.0.354.

        prompt_template = PromptTemplate.from_template("""
            Human: The following is a friendly conversation between a human and an AI.
            The AI is talkative and provides lots of specific details from its context. If the AI does not know
            the answer to a question, it truthfully says it does not know.

            Current conversation:
            <conversation_history>
            {history}
            </conversation_history>

            Here is the human's next reply:
            <human_reply>
            {input}
            </human_reply>

            Assistant:
            """)

When {history} is expanded, this gives two consecutive "Human:" prefixes. This results in a Bedrock warning:

/opt/python/langchain_community/llms/bedrock.py:57: UserWarning: Error: Prompt must alternate between 'Human:' and 'Assistant:'.

But Claude still seems to return good results.
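
A possible workaround here (again just a sketch, not something from this thread): keep the literal "Human:"/"Assistant:" markers only at the outer level of the template and rename the prefixes the memory uses, so the expanded {history} no longer contains the reserved words. The prefix values and prompt wording below are my assumptions:

from langchain.chains import ConversationChain
from langchain.chat_models import BedrockChat
from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate

llm = BedrockChat(model_id="anthropic.claude-v2", model_kwargs={"temperature": 0.0})

# History is rendered with "User:"/"AI:" so only the outer turn markers remain.
memory = ConversationBufferMemory(human_prefix="User", ai_prefix="AI")

prompt = PromptTemplate.from_template(
    "\n\nHuman: The following is a friendly conversation between a user and an AI.\n"
    "Current conversation:\n<conversation_history>\n{history}\n</conversation_history>\n"
    "Here is the user's next reply:\n<user_reply>\n{input}\n</user_reply>\n\nAssistant:"
)

conversation = ConversationChain(llm=llm, memory=memory, prompt=prompt, verbose=False)
print(conversation.predict(input="Hi, my name is Andrew"))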

rizblie commented 7 months ago

Found this, could be useful: https://github.com/langchain-ai/langchain/issues/11220

russell-dot-js commented 6 months ago

I just pushed a fix, please check it out: https://github.com/langchain-ai/langchain/pull/16968

FWIW, you can apply the fix before it's merged by extending the Bedrock model and implementing _convert_input from my PR, then using that subclass instead of the base Bedrock.