I tried to run this model in the Gradio GUI on Windows 10, but I had a few issues:
Weights were being loaded to the CPU.
Weights appeared to be loaded in FP32, which overflowed my VRAM and made inference extremely slow.
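To see why FP32 weights alone can overflow a 24GB card, here's some rough arithmetic (assuming the 7B-parameter chat variant; activations, the KV cache, and CUDA overhead come on top of this):

```python
# Rough weight-memory estimate for a ~7B-parameter model.
params = 7e9
fp32_gb = params * 4 / 1e9   # 4 bytes per FP32 weight
fp16_gb = params * 2 / 1e9   # 2 bytes per FP16/BF16 weight
print(f"FP32: {fp32_gb:.0f} GB, FP16: {fp16_gb:.0f} GB")  # FP32: 28 GB, FP16: 14 GB
```

So FP32 weights already exceed 24GB of VRAM before any inference state is allocated, while half precision fits comfortably.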
I modified Inference.py (the one in deepseek_vl\serve) a bit to fix those issues. I also made sure that my torch install was the CUDA 11.8 build and not the CPU-only one.
So, if someone else runs into problems running this model on 24GB of VRAM, this issue might help you.
Installation instructions (assuming you're already in a virtual env, which you should be using):

```
git clone https://github.com/deepseek-ai/DeepSeek-VL
cd DeepSeek-VL
pip install torch==2.0.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install -e .[gradio]
# replace the inference.py file in the repo (folder deepseek_vl/serve) with the one provided by me
# if you have the model downloaded locally, change the path in app_deepseek.py to the local one
python deepseek_vl/serve/app_deepseek.py
```
Here's the Inference.py that works for me: https://gist.github.com/adamo1139/511f63c01c6088d7747f47628ffc970c
I will be closing this issue; I just want to leave a trace that will hopefully save some time for others.
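The gist has the full file, but the core idea of the fix is just choosing half precision on the GPU instead of the default FP32-on-CPU path. A minimal sketch of that choice (my own helper for illustration, not the repo's API; the actual gist edits the model-loading code in inference.py):

```python
import torch

def pick_dtype_and_device():
    """Pick half precision on GPU when available; fall back to FP32 on CPU."""
    if torch.cuda.is_available():
        # bfloat16 avoids FP16 overflow issues on Ampere and newer cards;
        # plain float16 is the fallback for older GPUs
        dtype = torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16
        return dtype, "cuda"
    return torch.float32, "cpu"

dtype, device = pick_dtype_and_device()
# applied after loading, e.g.: model = model.to(device, dtype=dtype)
print(dtype, device)
```

On a CUDA build of torch this returns a half-precision dtype and "cuda"; on a CPU-only build it falls back to FP32 on CPU, which is exactly the slow behavior I was seeing before reinstalling torch with cu118.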