A serverless RAG QA bot for various clients
Setup:
pip3 install -r requirements.txt
sudo docker run --gpus '"all"' --shm-size 10g -p 11434:11434 -it alpindale/aphrodite-engine
python3 -m aphrodite.endpoints.openai.api_server --model mistralai/Mixtral-8x7B-Instruct-v0.1 --kv-cache-dtype fp8_e5m2 --served-model-name mistral --max-model-len 8096 --host 0.0.0.0 --port 11434
uvicorn api:app --host 0.0.0.0 --port 1337 --reload
curl -X POST http://localhost:1337/ask -H "Content-Type: application/json" -d '{"question": "Why is the sky blue?"}'
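For reference, a minimal sketch of what the api:app service could look like; the actual api.py in this repo almost certainly differs (and does real retrieval). It assumes FastAPI and the openai Python client, pointed at the OpenAI-compatible server started above:

from fastapi import FastAPI
from openai import OpenAI
from pydantic import BaseModel

app = FastAPI()
# The aphrodite server above exposes an OpenAI-compatible API on port 11434.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="none")

class Question(BaseModel):
    question: str

@app.post("/ask")
def ask(q: Question):
    # In the real bot, retrieved document context would be added to the prompt here.
    resp = client.chat.completions.create(
        model="mistral",  # matches --served-model-name above
        messages=[{"role": "user", "content": q.question}],
    )
    return {"answer": resp.choices[0].message.content}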
Run the frontend:
nvm install 18
nvm use 18
npm run dev
Run Ollama
cd ollama-docker
docker-compose up -d
# or, if you have an NVIDIA GPU configured:
docker-compose -f docker-compose-ollama-gpu.yaml up -d
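For orientation, the GPU variant of the compose file typically grants the container GPU access through a device reservation. A sketch of the relevant part (the actual docker-compose-ollama-gpu.yaml in ollama-docker may differ):

services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    deploy:
      resources:
        reservations:
          devices:
            # Expose all NVIDIA GPUs to the container
            - driver: nvidia
              count: all
              capabilities: [gpu]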
http://localhost:8000 to see what's running
http://localhost:11434 ollama service
http://localhost:3000 to see the UI
To download the model:
docker ps                        # lists the running containers
docker exec -it ollama bash      # enter the ollama container
ollama pull mistral
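The two container steps can also be collapsed into a single command run from the host:

docker exec -it ollama ollama pull mistral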
ollama run mistral               # run inside the container to test it's working
TODO: pull Mistral automatically when the container is built (i.e. add the pull to the Dockerfile)
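One possible (untested) way to close that TODO, assuming the image is built from the official ollama/ollama base: pull the model in a RUN step so it is baked into the image. Note that a volume mounted over /root/.ollama at runtime would hide the baked-in model.

FROM ollama/ollama
# ollama pull talks to a running server, so start one in the background,
# give it a moment to come up (crude wait), then pull; the downloaded
# weights end up in /root/.ollama inside this image layer.
RUN ollama serve & sleep 5 && ollama pull mistral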