NVIDIA / NeMo-Guardrails

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

How to return multiple output keys from chain action using Python API? #179

Open donatoaz opened 12 months ago

donatoaz commented 12 months ago

I am using the Python API to integrate a chain I already have into a rail.

This is an excerpt from my Colang file:

define flow default
    user ...
    $result = execute qa_chain(question=$last_user_message)

And this is how I have my code:

...
qa_chatbot = create_qa_chatbot_chain(bedrock_llm, kendra_retriever, chat_session_id)

guardrails_config = RailsConfig.from_path(os.path.join(os.path.dirname(__file__), "guardrails"))
guardrails = LLMRails(config=guardrails_config, verbose=True)
guardrails.register_action(qa_chatbot, name="qa_chain")
...

Besides the answer key, my chain returns source_documents and generated_question keys, and I need those in the rest of my code since I return them to the user.

res = guardrails.generate(prompt=question)

# I need res["answer"], res["source_documents"] and res["generated_question"]

How should I complete my default flow? I can't simply do bot $result without getting:

ERROR:nemoguardrails.actions.action_dispatcher:Error ***StartUtteranceBotAction events need to provide 'script' of type 'str' while execution generate_bot_message
baravit commented 12 months ago

Hi @donatoaz! What do you need to do with the other keys? Eventually, the bot should return a string answer. So either do bot $result['answer'], or pass everything to another action that does your extra logic and returns a formatted string to the user, something like:

In your flow:

$final_result = execute summarize_answer(result=$result)

In your Python code:

from nemoguardrails.actions import action

@action()
def summarize_answer(result):
    answer = result['answer']
    source_documents = result['source_documents']
    generated_question = result['generated_question']

    # Do whatever you need with them and return a string as the final bot response...
    ...
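
Note that the action needs to be discoverable: either define it in the config's actions.py, or register it explicitly, e.g. guardrails.register_action(summarize_answer, name="summarize_answer").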
donatoaz commented 11 months ago

Hi, so I guess I just have to wrap my head around NeMo's operating model a bit more.

I am using LangChain's ConversationalRetrievalChain (CRC), and at this point I am not willing to refactor it to use NeMo for RAG and for keeping the conversation memory/condensation.

I assume there is a way to cache the non-answer returns from CRC (source_documents, generated_question) in the action code (worst case, via a global variable) so that I can use them after the rails has finished running. But from your answer I understand that this is just not NeMo's operating model as a normal use case, and it feels hacky.
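
A minimal sketch of that global-variable workaround (the wrapper and cache names here are hypothetical; it registers a wrapper instead of the chain itself, and assumes the CRC's memory supplies the chat history):

# Module-level cache: outlives the rails run (hacky, as noted above).
_last_chain_output = {}

def qa_chain_action(question):
    output = qa_chatbot({"question": question})  # the CRC created earlier
    # Stash the non-answer keys for use after guardrails.generate() returns.
    _last_chain_output["source_documents"] = output["source_documents"]
    _last_chain_output["generated_question"] = output["generated_question"]
    return output["answer"]

# Register the wrapper instead of the chain:
guardrails.register_action(qa_chain_action, name="qa_chain")

res = guardrails.generate(prompt=question)
source_documents = _last_chain_output.get("source_documents")
generated_question = _last_chain_output.get("generated_question")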

Maybe I'm still not fully understanding how to use NeMo in integration with langchain.

In order not to waste this issue altogether, if you'd be willing to educate me a bit more I'd be interested in contributing another example showcasing a more complex integration with langchain.

Thanks for the attention!

drazvan commented 11 months ago

Hi @donatoaz !

This is a good question. Conceptually, what you want is a way to also get some "additional context" out of the generation. As a parallel, on the input side, you can also provide additional context by providing a message with the role "context", e.g.

[
  {
    "role": "context",
    "content": {
      "user_name": "John",
      "access_level": "admin"
    }
  },
  {
    "role": "user",
    "content": "Hello!"
  }
]
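
Such a message list is then passed through the messages-based API. A minimal sketch (guardrails being the LLMRails instance from the code above):

messages = [
    {"role": "context", "content": {"user_name": "John", "access_level": "admin"}},
    {"role": "user", "content": "Hello!"},
]
# The context values become variables (e.g. $user_name) available in the flows.
res = guardrails.generate(messages=messages)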

Currently, there is no equivalent on the output side. We could update the interface to support returning the context as well (or a subset of it). Here's a quick sketch:

result = guardrails.generate(prompt=question, return_context=["result"])

and the output would be

{
    "completion": "...",
    "context": {
        "result": {...},
    }
}

We can also support return_context=True, which would return the full context (but this could be very verbose, depending on the configuration).

What are your thoughts on this? I think we could add support for this in the next version.

HanchenXiong commented 7 months ago

Hi @drazvan, is there any update on this thread? Enabling extra structured output in the overall response is a much-needed feature:

{
    "completion": "...",
    "context": {
        "result": {...},
    }
}

Regarding your question about return_context=True possibly being verbose:

My thought is that the configuration should cover not only whether to return the context (return_context), but also whether the returned context is included in the conversation history. In some cases the context can be gigantic, and many of its details do not need to be included in the history. Therefore, we should be able to configure (1) whether to return it and (2) whether to include it in the history. Ideally, this would be configurable per message.

drazvan commented 7 months ago

Thanks for following up on this @HanchenXiong. We do have support for this now: https://github.com/NVIDIA/NeMo-Guardrails/blob/develop/docs/user_guides/advanced/generation-options.md#output-variables. It was added in 0.8.0.
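
Based on the linked guide, the usage is roughly the following (a sketch; result is whatever variable your flow assigns, like $result above):

res = guardrails.generate(
    messages=[{"role": "user", "content": question}],
    options={"output_vars": ["result"]},
)
# res is a GenerationResponse: res.response holds the bot message(s),
# and res.output_data["result"] holds the value of $result from the flow.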