This project offers a user-friendly interface for working with the Llama-3.2-11B-Vision and Molmo-7B-D models.
Both the Llama-3.2-11B-Vision-bnb-4bit and Molmo-7B-D-bnb-4bit models need 12GB of VRAM to run.
The model is selected via the command line.
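To see whether a GPU meets the 12GB figure before downloading a model, a small check along these lines can help. This is a minimal sketch and not part of the project: the helper names `has_enough_vram` and `report_gpu` are hypothetical, it assumes "12GB" means 12 x 10^9 bytes, and it only inspects CUDA device 0. It needs torch to be installed to report on a real GPU.

```python
REQUIRED_VRAM_GB = 12  # figure quoted above for the bnb-4bit models


def has_enough_vram(total_bytes: int, required_gb: float = REQUIRED_VRAM_GB) -> bool:
    """Return True if total_bytes covers required_gb (assumption: 1 GB = 10**9 bytes)."""
    return total_bytes >= required_gb * 10**9


def report_gpu() -> str:
    """Describe CUDA device 0, or explain why it cannot be inspected."""
    try:
        import torch  # imported lazily so the script still runs without torch
    except ImportError:
        return "torch is not installed, cannot inspect the GPU"
    if not torch.cuda.is_available():
        return "no CUDA device is visible to torch"
    props = torch.cuda.get_device_properties(0)
    verdict = "enough" if has_enough_vram(props.total_memory) else "less than"
    return f"{props.name}: {props.total_memory / 10**9:.1f} GB total ({verdict} {REQUIRED_VRAM_GB} GB)"


if __name__ == "__main__":
    print(report_gpu())
```

The check looks at total memory only, so memory already used by other applications is not accounted for.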
To set up and run this project on your local machine, follow the steps below:
Clone the repository to a convenient location on your computer:
git clone <repository-url>
cd <repository-directory>
Inside the cloned repository, create a virtual environment using the following command:
python -m venv venv-ui
Activate the virtual environment (the command below is for Windows):
.\venv-ui\Scripts\activate
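To confirm that the activation took effect, you can ask the interpreter whether it is running inside a virtual environment. The snippet below is an illustrative sketch, not a project file. It relies on the standard behaviour that `sys.prefix` differs from `sys.base_prefix` inside a venv, and the function name `in_virtualenv` is made up for this example.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv.

    Inside a venv, sys.prefix points at the environment directory,
    while sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

If the environment is active, the interpreter path printed should point inside the venv-ui directory.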
After activating the virtual environment, install the necessary dependencies from requirements.txt:
pip install -r requirements.txt
Install Torch and TorchVision using separate commands:
pip install torch==2.4.1+cu121 --index-url https://download.pytorch.org/whl/cu121
and
pip install torchvision==0.19.1+cu121 --index-url https://download.pytorch.org/whl/cu121
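After both installs, it can be worth checking that the environment really holds the CUDA 12.1 builds pinned above. The following is a rough sketch using only the standard library. The names `EXPECTED`, `installed_version`, and `check_version` are hypothetical helpers for this example, not part of the project.

```python
from importlib import metadata
from typing import Optional

# Versions pinned by the two pip commands above.
EXPECTED = {"torch": "2.4.1+cu121", "torchvision": "0.19.1+cu121"}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_version(installed: Optional[str], expected: str) -> str:
    """Classify an installed version as 'ok', 'missing', or 'mismatch'."""
    if installed is None:
        return "missing"
    return "ok" if installed == expected else "mismatch"


if __name__ == "__main__":
    for package, expected in EXPECTED.items():
        found = installed_version(package)
        status = check_version(found, expected)
        print(f"{package}: expected {expected}, found {found} ({status})")
```

A "mismatch" result usually means a different build was installed, for example one without the `+cu121` suffix, and the matching pip command should be run again inside the activated environment.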
To start the UI, you can either:
Use the run.bat script (Windows only)
Simply double-click on run.bat
or
Activate the virtual environment:
.\venv-ui\Scripts\activate
Run the Python script:
python clean-ui.py
This project is licensed under the MIT License. See the LICENSE file for more details.