deepseek-ai / DeepSeek-VL

DeepSeek-VL: Towards Real-World Vision-Language Understanding
https://huggingface.co/spaces/deepseek-ai/DeepSeek-VL-7B

tips for running the model in FP16 on 24GB GPU #42

Closed adamo1139 closed 7 months ago

adamo1139 commented 7 months ago

I tried to run this model in the Gradio GUI on Windows 10, but I ran into a few issues:

  1. Weights were being loaded onto the CPU instead of the GPU.
  2. Weights were seemingly loaded in FP32, which overflowed my 24 GB of VRAM and made inference extremely slow.
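
A rough back-of-the-envelope estimate (my own numbers, not from the repo) shows why: FP32 weights take 4 bytes per parameter, which for a ~7B model already exceeds 24 GB, while FP16 halves that.

# rough weight-memory estimate for a ~7B-parameter model
# (weights only, ignoring activations and KV cache)
params = 7e9
print(f"FP32: ~{params * 4 / 1e9:.0f} GB")  # ~28 GB, does not fit in 24 GB of VRAM
print(f"FP16: ~{params * 2 / 1e9:.0f} GB")  # ~14 GB, fits comfortably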

I modified inference.py (the one in deepseek_vl\serve) a bit to fix both issues. I also made sure that my torch was installed with CUDA 11.8 support rather than the CPU-only wheel.
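
For context, the fix boils down to loading the weights in half precision and moving them to the GPU. A minimal sketch of the idea, following the loading pattern from the repo's README (the actual code in deepseek_vl/serve/inference.py is structured differently; see my gist below for the full file):

import torch
from transformers import AutoModelForCausalLM
from deepseek_vl.models import VLChatProcessor

model_path = "deepseek-ai/deepseek-vl-7b-chat"
vl_chat_processor = VLChatProcessor.from_pretrained(model_path)

# load the weights, then cast to FP16 and move them to the GPU
# instead of leaving them in the FP32/CPU default
vl_gpt = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
vl_gpt = vl_gpt.to(torch.float16).cuda().eval()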

So, if you run into problems running this model on a 24 GB GPU, this issue might help you.

Installation instructions (assuming you're already in a virtual environment, which you should be using):

git clone https://github.com/deepseek-ai/DeepSeek-VL
cd DeepSeek-VL
pip install torch==2.0.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install -e .[gradio]
## replace the inference.py file in deepseek_vl/serve with the one I provide below
## if you have the model downloaded locally, you may want to change the model path in app_deepseek.py to a local one
python deepseek_vl/serve/app_deepseek.py
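
Before launching the app, it's worth a quick sanity check that the CUDA build of torch was actually installed (my own addition, not part of the repo's instructions):

import torch
print(torch.__version__)          # should end in +cu118, not +cpu
print(torch.cuda.is_available())  # should print True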

Here's the inference.py that works for me: https://gist.github.com/adamo1139/511f63c01c6088d7747f47628ffc970c

I'll close this issue now; I just want to leave a trace that will hopefully save others some time.