NVIDIA / NeMo-Guardrails

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

Error when using Flask, Gemini models (gemini-1.5-pro-001), and NemoGuardRails together #573

Open Badaloza opened 1 week ago

Badaloza commented 1 week ago

Description: I'm encountering an error while running a Flask application that integrates Gemini models with NemoGuardRails. The error seems to be related to asynchronous task handling within LLMRails.

Run the code below:

colang_content = """
# define limits
define user ask politics
    "what are your political beliefs?"
    "thoughts on the president?"
    "left wing"
    "right wing"

define bot answer politics
    "I'm a shopping assistant, I don't like to talk of politics."

define flow politics
    user ask politics
    bot answer politics
    bot offer help

# here we use the chatbot for anything else
define flow
    user ...
    $answer = execute custom_agent(user_message=$last_user_message)

    bot $answer
"""

yaml_content = """
models:
    - type: main
      engine: vertexai
      model: gemini-1.5-pro-001
"""

from nemoguardrails import LLMRails, RailsConfig
from google.cloud import aiplatform
import os
from flask import Flask, request

app = Flask(__name__)

# Vertex AI authentication
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'GOOGLE_APPLICATION_CREDENTIALS.json'
PROJECT_ID = "PROJECT_ID"
REGION = "us-central1"
aiplatform.init(project=PROJECT_ID, location=REGION)

class ModerationRails:
    def __init__(self):
        rail_config = RailsConfig.from_content(
            colang_content=colang_content,
            yaml_content=yaml_content
            )
        self.app = LLMRails(rail_config, verbose=False)
        print(self.app.llm)

    def custom_agent(self, user_message):
        return "Code is working fine ---- Output is : " + user_message

    def run(self, user_message):
        # Register the custom action and its parameter before each generation.
        self.app.register_action(self.custom_agent, name="custom_agent")
        self.app.register_action_param("user_message", user_message)
        bot_message = self.app.generate(prompt=user_message)
        return bot_message

moderation_rails = ModerationRails()

@app.route('/qna', methods=['GET', 'POST'])
def home():
    if request.method == 'POST':
        input_string = request.form['question']
        bot_message = moderation_rails.run(input_string)
        return bot_message
    # Placeholder response for GET requests.
    return '''
        else loop
    '''

if __name__ == '__main__':
    app.run(debug=False)

Error Message:

mylap@mypc-MacBook-Pro accelate % /usr/bin/python3 /Users/mylap/Desktop/accelate/nemo_issue.py
Fetching 7 files: 100%|█████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 19227.33it/s]
VertexAI
Params: {'model_name': 'gemini-1.5-pro-001', 'candidate_count': 1}
 * Serving Flask app 'nemo_issue' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on http://127.0.0.1:5000
Press CTRL+C to quit

Synchronous action `custom_agent` has been called.
127.0.0.1 - - [22/Jun/2024 20:01:55] "POST /qna HTTP/1.1" 200 -

Error while execution generate_user_intent: Task <Task pending name='Task-16' coro=<LLMRails.generate_async() running at /Users/mylap/Library/Python/3.8/lib/python/site-packages/nemoguardrails/rails/llm/llmrails.py:639> cb=[_run_until_complete_cb() at /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/asyncio/base_events.py:184]> got Future <Task pending name='Task-20' coro=<UnaryUnaryCall._invoke() running at /Users/mylap/Library/Python/3.8/lib/python/site-packages/grpc/aio/_call.py:568>> attached to a different loop
127.0.0.1 - - [22/Jun/2024 20:02:14] "POST /qna HTTP/1.1" 200 -

Error while execution generate_user_intent: Task <Task pending name='Task-24' coro=<LLMRails.generate_async() running at /Users/mylap/Library/Python/3.8/lib/python/site-packages/nemoguardrails/rails/llm/llmrails.py:639> cb=[_run_until_complete_cb() at /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/asyncio/base_events.py:184]> got Future <Task pending name='Task-28' coro=<UnaryUnaryCall._invoke() running at /Users/mylap/Library/Python/3.8/lib/python/site-packages/grpc/aio/_call.py:568>> attached to a different loop
127.0.0.1 - - [22/Jun/2024 20:02:15] "POST /qna HTTP/1.1" 200 -
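
For context, the message matches a generic asyncio failure mode: a Future created on one event loop is awaited by a Task running on a different loop. A minimal standalone sketch (no Flask, NemoGuardRails, or Vertex AI involved; purely illustrative) that raises the same "attached to a different loop" RuntimeError:

import asyncio

async def make_future():
    # The Future is bound to whichever event loop is running right now.
    return asyncio.get_running_loop().create_future()

# Create a Future on loop A...
loop_a = asyncio.new_event_loop()
fut = loop_a.run_until_complete(make_future())

async def await_it():
    await fut  # ...then await it from a Task on loop B.

# Raises RuntimeError: Task <...> got Future <...> attached to a different loop
loop_b = asyncio.new_event_loop()
loop_b.run_until_complete(await_it())

This would be consistent with the first request succeeding and later ones failing: the gRPC aio channel used by the Vertex AI client appears to stay bound to the loop of the first call, while each subsequent request ends up running on a different loop.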

Steps to Reproduce:

  1. Set up a Flask application with the provided code.
  2. Ensure integration with Gemini models and NemoGuardRails.
  3. Run the application.
  4. Send a POST request to http://127.0.0.1:5000/qna with form data containing the key question and a sample question as its value.
  5. The first request completes without error and returns a response.
  6. The second request fails with the error shown above.
  7. Every subsequent request fails with the same error. Example request:
    curl --location 'http://127.0.0.1:5000/qna' \
    --form 'question="how are you"'

Python version: 3.8.9

Flask version: 2.1.3

nemoguardrails version: 0.9.0

Additional Context:

With OpenAI models, this error does not occur. To use an OpenAI model instead, replace yaml_content in the code above with the following:

yaml_content = """
models:
    - type: main
      engine: openai
      model: gpt-3.5-turbo
"""

Note: This error only occurs when Flask, NemoGuardRails, and Gemini models are used together. Any two of the three work without error.

Any guidance or solution to fix this issue would be greatly appreciated.
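
One possible workaround (a sketch, not an official fix): keep a single long-lived event loop on a background thread and submit all rails calls to it through generate_async, so the gRPC futures created by the Vertex AI client always belong to the same loop. generate_on_shared_loop is a hypothetical helper name; generate_async is the coroutine that the traceback shows generate delegating to.

import asyncio
import threading

from nemoguardrails import LLMRails

# One long-lived event loop on a daemon thread, shared by all Flask requests.
_loop = asyncio.new_event_loop()
threading.Thread(target=_loop.run_forever, daemon=True).start()

def generate_on_shared_loop(rails: LLMRails, user_message: str) -> str:
    # Schedule the coroutine on the shared loop and block for its result.
    future = asyncio.run_coroutine_threadsafe(
        rails.generate_async(prompt=user_message), _loop
    )
    return future.result()

ModerationRails.run could then call generate_on_shared_loop(self.app, user_message) instead of self.app.generate(prompt=user_message). Whether this fully avoids the loop mismatch depends on the Vertex AI client internals, so treat it as a starting point rather than a confirmed fix.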