ansari-project / ansari-backend

Ansari is an AI assistant to help Muslims practice more effectively and non-Muslims to understand Islam
77 stars 14 forks

Add Continuous Integration support #6

Closed. waleedkadous closed this issue 7 months ago

waleedkadous commented 1 year ago

Difficulty: Medium Est time: 20 hours

At the moment, we do not have continuous integration. As a result, we sometimes ship versions of Ansari that may have issues. The goal is to add continuous integration to:

abdullah-alnahas commented 8 months ago

Assalamu alaikum brother M Waleed,

JazakAllahu Khairan for your work! May Allah SWT accept it.

I'd like to contribute to adding continuous integration. Here's my understanding of the issue and a proposed plan:

Proposed Plan:

  1. Write Pytest Tests:

    • Develop Pytest tests, guided by the logic in "api_v2_exercise.py" and "evals/batik/generate-answers.ipynb". These tests will focus on maintaining the accuracy of existing answers and catching potential errors in new code.
  2. Update Workflow (".github/workflows/python-app.yml"):

    • Integrate a service to launch a temporary Postgres server instance for testing purposes.
    • Execute the "sql/01_create_tables.sql" query to establish the required database structures.
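The schema step in this plan could be sketched roughly as follows. This is only an illustration: `apply_sql_files` and `demo` are hypothetical names, not part of the Ansari code base, and `sqlite3` stands in for the temporary Postgres instance purely so the sketch runs without a server. The real "sql/*.sql" scripts may contain several statements per file, which `sqlite3`'s `execute` does not accept, so a CI version would use a Postgres driver against the service container instead.

```python
import sqlite3
import tempfile
from pathlib import Path


def apply_sql_files(conn, sql_dir):
    """Execute every *.sql file in sql_dir, in filename order, on a DB-API connection.

    Returns the list of file names that were applied.
    """
    applied = []
    for path in sorted(Path(sql_dir).glob("*.sql")):
        cursor = conn.cursor()
        try:
            cursor.execute(path.read_text())
        finally:
            cursor.close()
        conn.commit()
        applied.append(path.name)
    return applied


def demo():
    """Apply two throwaway schema files to an in-memory database (assumption: demo only)."""
    with tempfile.TemporaryDirectory() as tmp:
        # Written out of order on purpose; the numeric prefix decides the order.
        Path(tmp, "02_beta.sql").write_text("CREATE TABLE beta (id INTEGER);")
        Path(tmp, "01_alpha.sql").write_text("CREATE TABLE alpha (id INTEGER);")
        conn = sqlite3.connect(":memory:")
        try:
            applied = apply_sql_files(conn, tmp)
            rows = conn.execute(
                "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
            )
            tables = [row[0] for row in rows]
        finally:
            conn.close()
    return applied, tables
```

Relying on the numeric file prefixes keeps the order identical to the one used when the scripts are run by hand.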

Questions:

I'm ready to start implementing this plan!

waleedkadous commented 8 months ago

Wa alaikum assalam Br Abdullah! May Allah reward you for going through the source code and understanding exactly what needs to be done.

Yes, it does integrate with my vision -- it's like you read my mind. You've covered the two types of testing I was hoping would be covered: normal CI testing, and answer quality testing.

Perhaps most important is testing to ensure the API implements permissions correctly: e.g. ensuring that a particular thread is only accessible to the person who created the thread.
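A permission test of this kind might look roughly like the sketch below. It assumes nothing about the real API: `ThreadStore`, `PermissionDenied`, and the user names are hypothetical stand-ins, and an actual test would call the backend's endpoints with two different users' tokens instead of an in-memory object.

```python
from dataclasses import dataclass, field


class PermissionDenied(Exception):
    """Raised when a user requests a thread they do not own."""


@dataclass
class ThreadStore:
    """Hypothetical in-memory stand-in for the thread storage behind the API."""

    _owners: dict = field(default_factory=dict)
    _next_id: int = 1

    def create_thread(self, user_id: str) -> int:
        thread_id = self._next_id
        self._next_id += 1
        self._owners[thread_id] = user_id
        return thread_id

    def get_thread(self, user_id: str, thread_id: int) -> dict:
        # Treat "missing" and "owned by someone else" the same way so that
        # callers cannot probe which thread ids exist.
        if self._owners.get(thread_id) != user_id:
            raise PermissionDenied(thread_id)
        return {"id": thread_id, "owner": user_id}


def test_thread_is_only_visible_to_its_creator():
    store = ThreadStore()
    thread_id = store.create_thread("alice")

    # The creator can read the thread back.
    assert store.get_thread("alice", thread_id)["owner"] == "alice"

    # Any other user is rejected.
    try:
        store.get_thread("bob", thread_id)
    except PermissionDenied:
        pass
    else:
        raise AssertionError("bob should not be able to read alice's thread")
```

The same shape (create as one user, read as another, expect a rejection) should carry over to an HTTP-level test once a test client for the API is available.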

abdullah-alnahas commented 8 months ago

I've begun addressing this issue. As a first step, I successfully ran Ansari locally and documented the necessary setup process. These instructions could be incorporated into a contribution guide or the README. I'm happy to collaborate on integrating them into the project's documentation!

Next Steps:

  1. Develop pytest tests for the API presenter.
  2. Create pytest tests specifically for the Gradio presenter, ensuring the absence of "cross-messaging" issues.
  3. Implement pytest tests to guarantee the quality of generated answers.
  4. Modify the workflow (".github/workflows/python-app.yml").

I'd appreciate your thoughts on these proposed tests.
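As a starting point for the answer-quality tests in step 3, something like the following sketch could work. Everything in it is an assumption for illustration: `find_regressions`, `fake_answer`, and the two sample questions are made up, and a real test would send the questions to the running backend and take its expected answers from the existing evaluation data.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so cosmetic changes do not fail the check."""
    return " ".join(text.lower().split())


def find_regressions(answer_fn, golden_cases):
    """Return the questions whose answer no longer contains the expected phrase.

    answer_fn    -- callable taking a question and returning the answer text
    golden_cases -- list of (question, expected_phrase) pairs
    """
    failures = []
    for question, expected_phrase in golden_cases:
        answer = normalize(answer_fn(question))
        if normalize(expected_phrase) not in answer:
            failures.append(question)
    return failures


# Hypothetical stand-in for a call to the running backend.
def fake_answer(question: str) -> str:
    canned = {
        "How many daily prayers are there?": "There are five daily prayers.",
        "Which month is the month of fasting?": "Muslims fast in Ramadan.",
    }
    return canned.get(question, "I do not know.")


# Made-up sample cases, for illustration only.
GOLDEN_CASES = [
    ("How many daily prayers are there?", "five daily prayers"),
    ("Which month is the month of fasting?", "ramadan"),
]


def test_no_answer_regressions():
    assert find_regressions(fake_answer, GOLDEN_CASES) == []
```

Checking for an expected phrase rather than an exact match keeps the test tolerant of rewording, at the cost of missing subtler quality changes.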

Running Ansari Locally

Prerequisites

Environment Setup

  1. Export API Keys: Set the following environment variables, replacing placeholders:

    export OPENAI_API_KEY="your_openai_api_key"
    export LANGFUSE_SECRET_KEY="your_langfuse_secret_key"
    export LANGFUSE_PUBLIC_KEY="your_langfuse_public_key"
    export KALEMAT_API_KEY="kalimat_visitor_api_key" 
  2. PostgreSQL Database (Ubuntu): Install, start, and configure.

    • Install PostgreSQL:

      sudo apt-get update
      sudo apt-get install postgresql postgresql-contrib
    • Start PostgreSQL Service:

      sudo systemctl start postgresql
    • Create User and Database:

      sudo su - postgres 
      createuser mwk # mwk stands for: "Mohammed Waleed Kadous" :)
      psql -c "ALTER USER mwk WITH PASSWORD 'pw';" 
      createdb -O mwk mwk 
      exit 
    • Set PostgreSQL Password:

      export PGPASSWORD="pw"
    • Configure Authentication:

      sudo -u postgres psql  # Open psql as the 'postgres' superuser
      GRANT ALL PRIVILEGES ON DATABASE mwk TO mwk;  -- Grant necessary permissions
      SHOW hba_file;  -- Find the location of the pg_hba.conf file
      \q
      sudo gedit /path/to/pg_hba.conf  # Edit the configuration file (replace '/path/to/pg_hba.conf' with the actual path)
      # In the file, find the line: local   all             all                               peer
      # Change 'peer' to 'md5', save and exit
      sudo systemctl restart postgresql  # Restart PostgreSQL for changes to take effect

      Explanation:

      • These commands let the 'mwk' user connect to the database with the password set above.
      • GRANT ALL PRIVILEGES... gives the 'mwk' user full control over the 'mwk' database.
      • SHOW hba_file; helps you find the configuration file that controls database access.
      • Editing pg_hba.conf and changing 'peer' to 'md5' switches local connections from matching the operating-system user to password-based authentication.
    • Create Tables:

      sudo -u postgres psql
      \i ./sql/01_create_tables.sql 
      \i ./sql/02_create_user_tokens.sql
      \i ./sql/03_create_reset_tokens.sql
      \i ./sql/04_create_feedback_table.sql
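Before starting the server, it may help to confirm that the variables exported in the steps above are actually set in the current shell. The sketch below is one way to do that; `missing_env_vars` is a hypothetical helper, not something the repository provides, and the variable list is simply the set of names used in this guide.

```python
import os

# Variable names taken from the setup steps in this guide.
REQUIRED_VARS = (
    "OPENAI_API_KEY",
    "LANGFUSE_SECRET_KEY",
    "LANGFUSE_PUBLIC_KEY",
    "KALEMAT_API_KEY",
    "PGPASSWORD",
)


def missing_env_vars(environ=None, required=REQUIRED_VARS):
    """Return the names in `required` that are unset or empty in `environ`.

    `environ` defaults to the current process environment; passing a plain
    dict makes the function easy to test.
    """
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]
```

Calling `missing_env_vars()` with no arguments inspects the real environment; an empty list means every variable is present.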

Running the Application

  1. Start Backend Server:

    gunicorn -w 2 -k uvicorn.workers.UvicornWorker main_api:app
  2. Execute Exercise Script:

    python api_v2_exercise.py local

waleedkadous commented 7 months ago

Thank you for all your hard work, Abdullah. Marking this as closed.