Leon-Sander / local_multimodal_ai_chat

GNU General Public License v3.0

Local Multimodal AI Chat

Overview

Local Multimodal AI Chat is a hands-on project aimed at learning how to build a multimodal chat application. This project is all about integrating different AI models to handle audio, images, and PDFs in a single chat interface. It's a great way for anyone interested in AI and software development to get practical experience with these technologies.

The main purpose here is to learn by doing. You'll see how different pieces like Whisper AI for audio, LLaVA for image processing, and Chroma DB for PDFs come together in a chat application. A full tutorial on how I created this repository can be found on my YouTube channel. That said, this is still a work in progress. There's plenty of room for improvement, and that's where you come in.

I'm really open to pull requests. Whether you have ideas for new features, ways to make the code better, or just want to fix a bug, your contributions are welcome. This project is as much about learning from each other as it is about building something cool.

So, if you're interested in AI chat applications and want to dive into how they're built, join in. Your code and ideas can help make this project better for everyone who wants to learn more about building with AI.

Features

Getting Started

To get started with Local Multimodal AI Chat, clone the repository and follow these simple steps:

  1. Create a Virtual Environment: for example with python3 -m venv venv, then activate it. I am currently using Python 3.10.12.

  2. Upgrade pip: pip install --upgrade pip

  3. Install Requirements: pip install -r requirements.txt

    Windows Users: The installation might differ a bit for you. If you encounter errors you can't solve, please open an issue here on GitHub.

  4. Set Up Local Models: Download the models you want to use. For image chat I used the LLaVA model (ggml-model-q5_k.gguf and mmproj-model-f16.gguf), along with the quantized Mistral model from TheBloke (mistral-7b-instruct-v0.1.Q5_K_M.gguf).

  5. Customize the config file: Check the config file and adjust it to match the models you downloaded.

  6. Optional - Change Profile Pictures: Place your user_image.png and/or bot_image.png inside the chat_icons folder.

  7. Enter commands in terminal:

    1. python3 database_operations.py (this initializes the SQLite database for the chat sessions)
    2. streamlit run app.py
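To give a rough idea of what initializing an SQLite database for chat sessions can look like, here is a minimal sketch using Python's built-in sqlite3 module. It is not the actual content of database_operations.py: the file name chat_sessions.db, the messages table, its columns, and the helper functions init_db, save_message, and load_messages are all assumptions made up for illustration.

```python
import sqlite3
from contextlib import closing

# Hypothetical database file name; the real project may use a different one.
DB_PATH = "chat_sessions.db"


def init_db(db_path=DB_PATH):
    """Create the (assumed) messages table if it does not exist yet."""
    with closing(sqlite3.connect(db_path)) as conn:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                """
                CREATE TABLE IF NOT EXISTS messages (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    session_id TEXT NOT NULL,
                    sender TEXT NOT NULL,
                    content TEXT NOT NULL
                )
                """
            )


def save_message(db_path, session_id, sender, content):
    """Append one message to a chat session."""
    with closing(sqlite3.connect(db_path)) as conn:
        with conn:
            conn.execute(
                "INSERT INTO messages (session_id, sender, content) VALUES (?, ?, ?)",
                (session_id, sender, content),
            )


def load_messages(db_path, session_id):
    """Return (sender, content) tuples of a session in insertion order."""
    with closing(sqlite3.connect(db_path)) as conn:
        cursor = conn.execute(
            "SELECT sender, content FROM messages WHERE session_id = ? ORDER BY id",
            (session_id,),
        )
        return cursor.fetchall()
```

Because the table is created with IF NOT EXISTS, calling init_db more than once is harmless, which is convenient for a setup script that users may run repeatedly. A chat app could then call save_message for every user and bot turn and load_messages when a session is reopened.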

Changelog

17.02.2024:

10.02.2024:

09.02.2024:

16.01.2024:

12.01.2024:

Possible Improvements

Contact Information

For any questions, please contact me at: