NVIDIA / NeMo-Guardrails

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

Concept: Asynchronously passing user/session information to a guardrails action #660

Open serhatgktp opened 1 month ago

serhatgktp commented 1 month ago

Here's a scenario: we are running the guardrails server, which uses FastAPI. Let's imagine that we'd like to keep track of each user of the server. For this, we could use LoginManager from the fastapi_login package, imported as follows:

from fastapi_login import LoginManager

This module lets us keep track of the users who call our server's API endpoints.

Now, let's also imagine an action defined in an actions.py file within a specific config. Here's our goal: We want to determine which user triggered the action.

We can start thinking about our problem by looking at the /completions endpoint. Note that this is an async method, so requests from multiple users may be in flight at the same time.

So here is my question: Is it possible to define an action such that it logs (or prints) the user that triggered the action?

One potential solution is to define a mutex-protected variable in the API file and import it in the action. The mutex would be acquired whenever the action is triggered, and the variable would be set to the current user's information. The caveat is that requests would then be handled one at a time while the mutex is held, which defeats the purpose of having async endpoints.
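To make the caveat concrete, here is a minimal sketch of the mutex idea using only asyncio. All names (`current_user`, `user_lock`, `handle_request`, `log_user_action`) are hypothetical stand-ins for the endpoint and the action, not part of NeMo Guardrails or FastAPI.

```python
import asyncio

# Hypothetical shared state, standing in for a variable defined in the API
# file and imported by actions.py. None of these names come from NeMo Guardrails.
current_user = None
user_lock = None
in_flight = 0        # requests currently inside the locked section
max_in_flight = 0    # highest concurrency observed

async def log_user_action():
    # Stand-in for an action: reads the shared variable after some async work.
    await asyncio.sleep(0.01)
    return f"action triggered by {current_user}"

async def handle_request(user):
    # Stand-in for the async endpoint. The lock has to be held for the whole
    # request, otherwise another request could overwrite current_user.
    global current_user, in_flight, max_in_flight
    async with user_lock:
        in_flight += 1
        max_in_flight = max(max_in_flight, in_flight)
        current_user = user
        try:
            return await log_user_action()
        finally:
            in_flight -= 1

async def main():
    global user_lock
    user_lock = asyncio.Lock()  # created inside the running event loop
    users = ["alice", "bob", "carol"]
    return await asyncio.gather(*(handle_request(u) for u in users))

results = asyncio.run(main())
```

Each result names the right user, but `max_in_flight` never exceeds 1: the three requests run one after another even though the endpoint is async.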

Is there a way to pass current user information to an action without disrupting concurrency?

I'm happy to elaborate if needed. Thanks!

drazvan commented 1 month ago

Hi @serhatgktp!

The general pattern would be to use a ContextVar (from Python's contextvars module), which works with asyncio. As a quick example:

Let me know if you need more guidance.

serhatgktp commented 3 weeks ago

Hi @drazvan,

Could you please provide an updated example? I had referenced this when you first responded but there have been commits in these files since then and I'm not entirely sure which lines your initial links refer to.

Thanks!

drazvan commented 3 weeks ago

Sure, I was referring to raw_llm_request.set(prompt) and raw_prompt = raw_llm_request.get(), i.e., setting a context var value and retrieving it further down the callback chain. asyncio runs each task in its own copy of the context, so get() returns the value that was set earlier in the same callback chain, not one set by a concurrent request.
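Applied to the user-tracking scenario from the original question, the same set/get pattern might look like the sketch below. It uses only the standard library; `current_user`, `handle_request`, and `log_user_action` are hypothetical names, and it assumes the action is awaited within the same task as the endpoint that sets the variable.

```python
import asyncio
from contextvars import ContextVar

# Hypothetical context variable; in the scenario above it would live in a
# module that both the API file and actions.py import.
current_user = ContextVar("current_user", default="anonymous")

async def log_user_action():
    # Stand-in for an action: reads the value set earlier in this task.
    await asyncio.sleep(0.01)  # other requests run while this one waits
    return f"action triggered by {current_user.get()}"

async def handle_request(user):
    # Stand-in for the async endpoint: set the value, then call down the chain.
    current_user.set(user)
    return await log_user_action()

async def main():
    users = ["alice", "bob", "carol"]
    # gather() wraps each coroutine in its own Task, and each Task gets its
    # own copy of the context, so the set() calls do not interfere.
    return await asyncio.gather(*(handle_request(u) for u in users))

results = asyncio.run(main())
# The value set inside the tasks is not visible outside them.
outside = current_user.get()
```

No lock is needed here: the three requests overlap, yet each action sees the user set by its own request, and code outside those tasks still sees the default value.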