MeowLake opened this issue 1 year ago
Hi @MeowLake!
Let me try to shed some light on what's happening behind the scenes.
First, the guardrails process has three main stages: interpreting the user's input (a.k.a. generating the canonical user message), deciding what the bot should do/say next, and generating the bot utterance. The LLM can potentially be involved in each of the three stages, which means a single turn can take 1, 2 or 3 LLM calls. I'll show this using the 3 rounds in your examples below. But before that, there's one important observation I want to make: currently, we only use the first line from the output of the LLM. Sometimes you will see multiple lines predicted after the chain finishes, but we discard them. We currently show them for information purposes, and this shows that the LLM is capable of predicting potentially the whole thing in one go. We'll be optimising for that in one of the future versions.
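As a concrete sketch, here is one turn written in the same conversation format as the `sample_conversation` from the configuration guide (the utterance text is illustrative, not actual output): the indented line under `user` is the stage-1 canonical message, the `bot ...` line is the stage-2 next step, and the quoted line under it is the stage-3 utterance.

```
user "What can you do for me?"
  ask about capabilities
bot inform capabilities
  "I can help with scheduling appointments, setting reminders, and more."
```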
A single call
When you say "hello there", like in your first turn, a first call is made to the LLM to generate the canonical user message (express greeting
). Next, because there is a flow in the configuration that starts with user express greeting
, that flow will be used to determine the next step i.e. bot express greeting
. And because there is already a static message for that, it will be used directly. So, no additional LLM calls were needed.
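For reference, a minimal configuration that produces this single-call behaviour looks roughly like the hello world example (the exact texts here are illustrative):

```
define user express greeting
  "hello there"

define flow greeting
  user express greeting
  bot express greeting

define bot express greeting
  "Hello there! How can I help you today?"
```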
Two calls
When you ask "What can you do for me?" in the third turn, the first call to the LLM generates the `ask about capabilities` canonical form. The next call generates the next step, i.e. `bot inform capabilities`. For the third step, the LLM is not needed since you've explicitly defined the `inform capabilities` bot message.
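In other words, when the configuration defines only the bot message and no flow starting with `user ask about capabilities`, as in the illustrative sketch below, the second step needs an LLM call but the third does not:

```
define bot inform capabilities
  "I can help you schedule appointments, set reminders, and look up information."
```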
Three calls
When you say "I fell very bad today" in your second turn, the first LLM call generates the user canonical message express negative emotion
. Because of a bug that has now been fixed, the next step is not generated from the static flow and the LLM is used to generate the next step i.e. bot express empathy
. For the third step, the LLM is used to generate the express empathy message
.
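Presumably the configuration contains a flow like the sketch below (my reconstruction, not your actual file), with no static message defined for `bot express empathy`; with the bug fixed, the second step should come from this flow, so only the first and third steps should need the LLM:

```
define flow
  user express negative emotion
  bot express empathy
```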
I hope this helps. This week, we'll push a new round of updates to GitHub, and they will be published to PyPI at the end of the month. One of the updates is a better logging experience, which shows more clearly when the LLM is used, at what stage, what the prompt/completion is, and a few other things.
@drazvan Thank you! These details are very helpful.
I tried to fit the above examples into the guardrails process but still found some confusing pieces. Can you please confirm the breakdown below? Let's use the 3rd-turn chat as an example:
```
> Entering new LLMChain chain...
> Finished chain.
ask about capabilities
bot respond about capabilities
"As I mentioned earlier, I can do anything! Specifically, I can help you with tasks such as scheduling appointments, setting reminders, providing information on various topics, and even ordering food or making reservations. Is there anything specific you need help with?"

> Entering new LLMChain chain...
> Finished chain.
bot inform capabilities
bot inform capabilities
```
Issue:
Sometimes the system enters two LLMChain chains that execute similar actions in a conversation when using the `generate` function. Is this expected or not?
Log for a single conversation:
Log for a three-round chat (appended in history):
Repro steps:
1. Create a `config.yml` that only contains `sample_conversation`, like the configuration guide shows.
2. Create `hello.co` following the hello world example.
3. Change the `define bot inform capabilities` message to "I am a AI Homelander, I do whatever I want!" (a sketch of this change follows the list).
Concerns:
- One chain generates `express negative emotion` + `bot express empathy`, while another one only generates `bot express empathy`.
- The `bot_message` does use the defined Homelander material.
- When testing `ChatOpenAI` with `ConversationChain`, only one chain is entered in each chat, so it doesn't look like an issue on the LangChain side.

Package and versions:
Can you please help identify the expected/unexpected behaviors above?