NVIDIA / NeMo-Guardrails

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

Fix multiline LLM output syntax error for dynamic flow generation #748

Closed. radinshayanfar closed this pull request 2 months ago.

radinshayanfar commented 2 months ago

This PR adds one more demonstration to the default v2 sample-conversation prompts, covering multiline text output with the `bot say` flow. Before this PR, any dynamic flow generation with LLMs that needed more than one line of output resulted in a syntax error. Example Colang application:

import core
import llm

flow main
  activate llm continuation

Example application trace:

> Give a list of car manufacturers
⠙ Working ...WARNING:nemoguardrails.colang.v2_x.runtime.runtime:Failed parsing a generated flow
@meta(bot_intent="bot respond provide a list of car manufacturers")
flow _dynamic_e8e6537a bot respond provide a list of car manufacturers
  bot say "Sure! Here’s a list of some well-known car manufacturers:
  1. Toyota
  2. Ford
  3. Volkswagen
  4. Honda
  5. General Motors (Chevrolet, GMC, Cadillac, Buick)
  6. BMW
  7. Mercedes-Benz
  8. Audi
  9. Nissan
  10. Hyundai
  11. Kia
  12. Subaru
  13. Tesla
  14. Fiat Chrysler Automobiles (now part of Stellantis)
  15. Volvo
  16. Land Rover
  17. Mazda
  18. Mitsubishi
  19. Jaguar
  20. Porsche
No terminal matches '"' in the current parser context, at line 3 col 11

  bot say "Sure! Here’s a list of some well-known 
          ^
Expected one of: 
    * LPAR
    * PLUS
    * COLON
    * TILDE
    * EQUAL
    * LBRACE
    * DEC_NUMBER
    * "->"
    * _AND
    * VAR_NAME
    * MINUS
    * NAME
    * _NEWLINE
    * LONG_STRING
    * RPAR
    * FLOAT_NUMBER
    * STRING
    * DOT
    * _OR
    * LSQB

Previous tokens: Token('NAME', 'say')
:
  bot say "Sure! Here’s a list of some well-known car manufacturers:
          ^
Parsing failed for LLM generated flow: `_dynamic_e8e6537a bot respond provide a list of car manufacturers`

>

By adding one more sample conversation that demonstrates that multiline outputs should be joined with `\n` escapes, the problem above is mostly fixed.
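For illustration, the same bot response parses correctly when the LLM emits it as a single line with `\n` escapes, which is the form the added sample conversation demonstrates (list shortened here for brevity):

```colang
@meta(bot_intent="bot respond provide a list of car manufacturers")
flow _dynamic_e8e6537a bot respond provide a list of car manufacturers
  bot say "Sure! Here’s a list of some well-known car manufacturers:\n1. Toyota\n2. Ford\n3. Volkswagen\n..."
```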

There could be another workaround, namely post-processing the LLM-generated flow. However, I think it is better left to the LLM's discretion to generate valid Colang flows in the first place.
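For reference, such a post-processing pass could look roughly like the following. This is a hypothetical sketch, not part of NeMo Guardrails; the helper name `collapse_multiline_say` and its heuristic of counting quotes per line are my own assumptions:

```python
def collapse_multiline_say(flow_text: str) -> str:
    """Join the lines of an unterminated double-quoted string into a
    single line, separated by literal \\n escapes.

    Hypothetical helper, not part of NeMo Guardrails. The odd/even
    quote count is a naive heuristic: it ignores escaped quotes and
    Colang LONG_STRING literals.
    """
    out = []
    buf = None  # accumulates the lines of an open string literal
    for line in flow_text.split("\n"):
        if buf is None:
            if line.count('"') % 2 == 1:  # line opens a string but never closes it
                buf = [line]
            else:
                out.append(line)
        else:
            buf.append(line.strip())
            if line.count('"') % 2 == 1:  # closing quote found
                out.append("\\n".join(buf))
                buf = None
    if buf is not None:
        out.append("\\n".join(buf) + '"')  # close a dangling string at EOF
    return "\n".join(out)
```

Even so, a heuristic like this has to guess at string boundaries, which is why guiding the model through an extra few-shot example seems more robust.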