NVIDIA / NeMo-Guardrails

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

Prevent NeMo Guardrails from Sending the Actual User Input to the LLM #706

Open minghongg opened 2 months ago

minghongg commented 2 months ago

Hi team,

Is it possible to configure NeMo Guardrails to avoid sending the actual user input to the LLM? I understand that the actual user input won't be sent if the input rails are triggered. However, is it also possible to prevent the user input from being sent regardless of whether the input rails are triggered? Thanks!

Drewwb commented 2 months ago

Yes, it is possible to configure NeMo Guardrails to avoid sending the actual user input to the LLM, regardless of whether input rails are triggered.

In your Colang file you could add something along these lines:

define user_input_passes_guardrails as user says something not offensive or inappropriate

when user_input_passes_guardrails:
    bot says "pass"
    action stop_processing  # This stops the input from being sent to the LLM

You could also chain another LLM to your guardrails. In essence, you would prompt this observer LLM so that it never outputs the user's input.
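
The chaining idea can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not NeMo Guardrails code: `observer_llm`, `main_llm`, `guarded_generate`, and `ALLOWED_INTENTS` are hypothetical names, and the observer is a keyword stub where a real setup would call a prompted model. The point is that the main model receives only a label from a fixed set, never the raw user text.

```python
# Hypothetical fixed vocabulary the observer may emit; nothing else reaches the main LLM.
ALLOWED_INTENTS = ("ask_order_status", "ask_refund", "other")


def observer_llm(user_input: str) -> str:
    """Stand-in for an observer model: maps raw text to one allowed intent label.

    A real observer would be a prompted LLM; this keyword stub keeps the sketch runnable.
    """
    text = user_input.lower()
    if "refund" in text:
        return "ask_refund"
    if "order" in text:
        return "ask_order_status"
    return "other"


def main_llm(prompt: str) -> str:
    """Stand-in for the main model; echoes the prompt so we can inspect what it received."""
    return f"[main LLM saw] {prompt}"


def guarded_generate(user_input: str, observer=observer_llm, main=main_llm) -> str:
    """Pass only a whitelisted label, never the raw user input, to the main model."""
    label = observer(user_input)
    # Defensive check: if the observer returns anything outside the fixed set
    # (for example, by echoing the user's text), fall back to a safe label.
    if label not in ALLOWED_INTENTS:
        label = "other"
    return main(f"The user intent is: {label}")
```

Note that the observer itself still sees the user input, so this only keeps the raw text away from the downstream model. The whitelist check is there because a prompt alone may not guarantee that the observer never repeats what the user wrote.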

Pouyanpi commented 1 month ago

@minghongg, do you mean that you want to use predefined flows only?

What do you want to do with the user input? It'd be great if you could explain your use case in more detail.

drazvan commented 1 month ago

Adding this for reference as well: https://docs.nvidia.com/nemo/guardrails/user_guides/input_output_rails_only/README.html#using-only-input-and-output-rails