mbzuai-oryx / Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
https://mbzuai-oryx.github.io/Video-ChatGPT
Creative Commons Attribution 4.0 International

How to download the ready LLaVA-Lightening-7B weights #97

Open · SIGMIND opened this issue 2 months ago

SIGMIND commented 2 months ago

As mentioned in the offline demo README: "Alternatively you can download the ready LLaVA-Lightening-7B weights from mmaaz60/LLaVA-Lightening-7B-v1-1." The Hugging Face repo has files named pytorch_model-00001-of-00002.bin and pytorch_model-00002-of-00002.bin. Should I convert the model to GGUF format to use it with the offline demo?

mmaaz60 commented 2 months ago

Hi @SIGMIND,

No conversion is required; you can clone it directly from Hugging Face as below:

git lfs install
git clone https://huggingface.co/mmaaz60/LLaVA-7B-Lightening-v1-1

Then, download the projection weights:

git clone https://huggingface.co/MBZUAI/Video-ChatGPT-7B

Finally, you should be able to run the demo as:

python video_chatgpt/demo/video_demo.py \
        --model-name LLaVA-7B-Lightening-v1-1 \
        --projection_path Video-ChatGPT-7B/video_chatgpt-7B.bin

I hope this helps. Let me know if you have any questions. Thanks!
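
Editorial aside (not from the thread): a common pitfall with LFS-hosted checkpoints is that if git lfs install was skipped before cloning, the .bin shards come down as kilobyte-sized pointer stubs rather than the real multi-gigabyte weights, and loading will fail. A minimal sanity check, assuming the clone paths used above:

ls -lh LLaVA-7B-Lightening-v1-1/pytorch_model-0000*.bin   # each shard should be several GB, not a few hundred bytes
cd LLaVA-7B-Lightening-v1-1 && git lfs pull               # fetches the real payloads if the files are LFS stubs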

SIGMIND commented 2 months ago

Thanks, the steps helped me move forward with the models. However, is there any specific GPU requirement for running this locally? I have tried to run it on an RTX 2060 but am getting the error below:

python video_chatgpt/demo/video_demo.py
2024-04-18 17:52:45 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, model_name='LLaVA-7B-Lightening-v1-1', vision_tower_name='openai/clip-vit-large-patch14', conv_mode='video-chatgpt_v1', projection_path='/mnt/sdc1/Video-ChatGPT/Video-ChatGPT-7B/video_chatgpt-7B.bin')
You are using a model of type llava to instantiate a model of type VideoChatGPT. This is not supported for all configurations of models and can yield errors.
Loading checkpoint shards: 100%|██████████| 2/2 [08:46<00:00, 263.40s/it]
preprocessor_config.json: 100%|██████████| 316/316 [00:00<00:00, 1.74MB/s]
2024-04-18 18:01:57 | INFO | stdout | Loading weights from /mnt/sdc1/Video-ChatGPT/Video-ChatGPT-7B/video_chatgpt-7B.bin
2024-04-18 18:02:24 | INFO | stdout | Weights loaded from /mnt/sdc1/Video-ChatGPT/Video-ChatGPT-7B/video_chatgpt-7B.bin
2024-04-18 18:02:24 | ERROR | stderr | Traceback (most recent call last):
2024-04-18 18:02:24 | ERROR | stderr |   File "/mnt/sdc1/Video-ChatGPT/video_chatgpt/demo/video_demo.py", line 264, in <module>
2024-04-18 18:02:24 | ERROR | stderr |     initialize_model(args.model_name, args.projection_path)
2024-04-18 18:02:24 | ERROR | stderr |   File "/mnt/sdc1/Video-ChatGPT/video_chatgpt/eval/model_utils.py", line 131, in initialize_model
2024-04-18 18:02:24 | ERROR | stderr |     model = model.cuda()
2024-04-18 18:02:24 | ERROR | stderr |   File "/home/sig/Downloads/[/mnt/DockerRuntime/miniconda]/envs/video_chatgpt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 905, in cuda
2024-04-18 18:02:24 | ERROR | stderr |     return self._apply(lambda t: t.cuda(device))
2024-04-18 18:02:24 | ERROR | stderr |   File "/home/sig/Downloads/[/mnt/DockerRuntime/miniconda]/envs/video_chatgpt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 797, in _apply
2024-04-18 18:02:24 | ERROR | stderr |     module._apply(fn)
2024-04-18 18:02:24 | ERROR | stderr |   File "/home/sig/Downloads/[/mnt/DockerRuntime/miniconda]/envs/video_chatgpt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 797, in _apply
2024-04-18 18:02:24 | ERROR | stderr |     module._apply(fn)
2024-04-18 18:02:24 | ERROR | stderr |   File "/home/sig/Downloads/[/mnt/DockerRuntime/miniconda]/envs/video_chatgpt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 820, in _apply
2024-04-18 18:02:24 | ERROR | stderr |     param_applied = fn(param)
2024-04-18 18:02:24 | ERROR | stderr |   File "/home/sig/Downloads/[/mnt/DockerRuntime/miniconda]/envs/video_chatgpt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 905, in <lambda>
2024-04-18 18:02:24 | ERROR | stderr |     return self._apply(lambda t: t.cuda(device))
2024-04-18 18:02:24 | ERROR | stderr |   File "/home/sig/Downloads/[/mnt/DockerRuntime/miniconda]/envs/video_chatgpt/lib/python3.10/site-packages/torch/cuda/__init__.py", line 247, in _lazy_init
2024-04-18 18:02:24 | ERROR | stderr |     torch._C._cuda_init()
2024-04-18 18:02:24 | ERROR | stderr | RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW

biphobe commented 2 months ago

The last line of your logs suggests a driver issue.
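
A quick way to confirm the driver theory independently of Video-ChatGPT (a general troubleshooting sketch, not something from this thread): Error 804 typically indicates a mismatch between the installed NVIDIA kernel driver and the CUDA runtime PyTorch was built against, which both of the checks below will surface:

nvidia-smi   # should list the GPU and report driver/CUDA versions without errors
python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"   # should print True and a CUDA version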

SIGMIND commented 2 months ago

Understood, and that is now resolved. But how much GPU memory is required to run it offline? I have a 12 GB RTX 2060 and am getting this error:

2024-04-28 15:27:44 | ERROR | stderr |     return self._apply(lambda t: t.cuda(device))
2024-04-28 15:27:44 | ERROR | stderr | torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 MiB (GPU 0; 11.73 GiB total capacity; 11.26 GiB already allocated; 26.88 MiB free; 11.27 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

biphobe commented 2 months ago

I've run the model locally on an RTX 2070 SUPER successfully, and I've also run it in the cloud with no issues.

Your problem seems related to your setup. Try closing every other app on your system and then run the model. In my case, during my initial local attempt, the browser was reserving GPU memory and caused the error you just mentioned.
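
Editorial aside: for the out-of-memory case specifically, the error message itself points at one allocator knob worth trying. A hedged sketch (the 128 MiB value is illustrative, not a repo recommendation):

nvidia-smi   # run first to see which other processes are holding GPU memory

PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 \
    python video_chatgpt/demo/video_demo.py \
        --model-name LLaVA-7B-Lightening-v1-1 \
        --projection_path Video-ChatGPT-7B/video_chatgpt-7B.bin

Note that this only mitigates fragmentation; a 7B model plus a visual encoder is a tight fit in 12 GB, so freeing memory held by other GPU consumers, as suggested above, may still be necessary.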