Iodine98 / dora-back

A Python backend for Document Retrieval and Analysis (DoRA).
MIT License
docker docker-compose langchain rag retrieval-augmented-generation

dora-back

The backend for Document Retrieval and Analysis (DoRA)

Run using Poetry and Python

How to install the dependencies

Either clone this project in VSCode or open a new codespace (if you have not been invited into another one).

The devcontainer.json should contain all the plugins needed to get going, including the installation of Poetry (which may still need to be done manually).

Additionally, set the FILE_PATH environment variable to where you store the PDF file and include your OpenAI API key in the OPENAI_API_KEY environment variable.
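A minimal sketch of a startup check for these two variables. The helper name `require_env` is hypothetical and not part of this repository; only the variable names FILE_PATH and OPENAI_API_KEY come from the instructions above.

```python
import os


def require_env(names, environ=None):
    """Return the required variables as a dict, or raise if any is missing or empty."""
    # `environ` can be injected for testing; it defaults to the real process environment.
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}
```

Calling `require_env(["FILE_PATH", "OPENAI_API_KEY"])` before starting the server would fail early with a readable message instead of failing later inside a request.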

Subsequently, run poetry update in the terminal to install all the dependencies and create the environment.

Allow GPU-inference for local models

Set the CMAKE_ARGS environment variable according to the llama-cpp-python documentation

Run the Flask server for the endpoints

Make sure to set all the environment variables like:

Run the Streamlit app

Run poetry run streamlit run st_app.py

Run Flask server using Docker container

Please configure the values in the Dockerfile before proceeding.

Build the Docker image using:

docker build -t dora-backend --build-arg OPENAI_API_KEY=<openai_api_key> .

The --build-arg options pass settings for local models or API keys into the build. Please have a look at the Dockerfile to familiarize yourself with the defaults.

Run the Docker container using:

docker run --name <container_name> -p 5000:8000 \
-e <environment_variable>=<value> \
-e <environment_variable>=<value> \
dora-backend

You can access the server at localhost:5000. Overriding the default values for the environment variables is optional.
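Docker treats everything after the image name as the command to run inside the container, so options such as -e and -p have to come before the image. The sketch below assembles such a command in Python. The function name `docker_run_command` and the container name `dora` are hypothetical; the image tag and port mapping are taken from the build and run commands in this section.

```python
import shlex


def docker_run_command(image, container_name, env=None, host_port=5000, container_port=8000):
    """Build a `docker run` argument list with every option placed before the image."""
    args = ["docker", "run", "--name", container_name, "-p", f"{host_port}:{container_port}"]
    for key, value in (env or {}).items():
        args += ["-e", f"{key}={value}"]
    # The image goes last: anything after it is handed to the container as its command.
    args.append(image)
    return args


cmd = docker_run_command("dora-backend", "dora", {"CURRENT_ENV": "PROD"})
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` if Docker is installed on the host.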

Removing CORS and connecting to remote Vector DB

To remove the CORS wrapper and connect to a remote vector database, set the CURRENT_ENV environment variable to PROD.
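One way such a switch could be read at startup is sketched below. This is an illustration under assumptions, not the repository's actual code: the helper `runtime_settings` and its dictionary keys are hypothetical, and the exact, case-sensitive comparison with "PROD" is assumed.

```python
import os


def runtime_settings(environ=None):
    """Derive two illustrative switches from CURRENT_ENV (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    is_prod = environ.get("CURRENT_ENV") == "PROD"
    return {
        "enable_cors": not is_prod,       # keep the CORS wrapper outside PROD
        "use_remote_vector_db": is_prod,  # connect to the remote vector DB only in PROD
    }
```

Any other value, or an unset variable, leaves the CORS wrapper in place and keeps the vector database local in this sketch.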

Query the MariaDB

  1. Log in to the MariaDB instance:

    docker exec -it ${CONTAINER_NAME} mariadb -u ${MARIADB_USER} -D final_answer \
    -p${MARIADB_PASSWORD}
  2. Run the following SQL statement to view the first five final answers:

    SELECT * FROM final_answer LIMIT 5;
  3. To switch to the chat_history database:

    \u chat_history
  4. To view the first five chat-history items:

    SELECT * FROM chat_history LIMIT 5;

init.sql

The purpose of this file is to grant privileges to the user main. I have not figured out how to parameterize this.