NVIDIA / NeMo-Guardrails

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

How to get bot message in self_check_output #757

Closed QuyAnh2005 closed 1 month ago

QuyAnh2005 commented 2 months ago

Hi, I am using self_check_output in my code. However, I also want to keep the original bot message and show the guardrails output as a warning. How can I do that?

Pouyanpi commented 2 months ago

Hi @QuyAnh2005,

the easiest way would be to update the message of the bot refuse to respond canonical form. You can always access the user message with $user_message and the bot response with $bot_message. So add the following to your .co file:

define bot refuse to respond 
  "I'm sorry, I can't respond to that. Blocked `$user_message` input. "

Note that both self_check_input and self_check_output use bot refuse to respond, so something like the following might be helpful:

define bot refuse to respond 
  "I'm sorry, I can't respond to that. {% if user_message %}Blocked user message: '{{ user_message }}'.{% endif %}{% if bot_message %} Blocked bot message: '{{ bot_message }}'.{% endif %}"

The above works for both self_check_input and self_check_output. Again, note that other flows may also use bot refuse to respond, so you might need to adapt it.
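
One possible adaptation, as a minimal sketch: give the output rail its own message, so that any other flow relying on bot refuse to respond keeps the generic refusal. The name bot inform output blocked is hypothetical, and the sketch assumes that a self check output flow defined in your own .co file is used in place of the built-in one.

```colang
define flow self check output
  $allowed = execute self_check_output

  if not $allowed
    # Hypothetical dedicated message, used only by the output rail
    bot inform output blocked
    stop

define bot inform output blocked
  "I'm sorry, I can't respond to that. Blocked bot message: '{{ bot_message }}'."
```

With this arrangement bot refuse to respond stays untouched, so the input rail and any other flow that uses it are not affected by the change.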

Note: it is not recommended to show the blocked output of the output rails to the user. In 0.10.0 we have added support for RailsException, which is a better way to achieve what you intend. It will be released soon.

I hope it helps!

QuyAnh2005 commented 2 months ago

Thank you @Pouyanpi, I will try your recommendations. Besides, I also found a similar way:

define flow self check output
  $allowed = execute self_check_output
  bot recall respond

  if not $allowed
    bot refuse to respond
    stop

define bot recall respond
  "{{ bot_message }}"

define bot refuse to respond
  "**There seems to be an anomaly here. Please check further.**"

Pouyanpi commented 2 months ago

Great! If you can use the develop branch, you might also be interested in trying out RailsException and looking at the examples in the library's flows.v1.co files.