Following previous works like VideoChat and VideoChat2, our model is based on Vicuna-7B v0. You should download the Vicuna checkpoint from https://huggingface.co/lmsys/vicuna-7b-delta-v0.
Note that the downloaded files are only the delta weights; you need to follow the instructions in the README at https://huggingface.co/lmsys/vicuna-7b-delta-v0 and apply them on top of the original LLaMA weights to obtain the actual Vicuna weights.
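For reference, the model card documents the conversion via FastChat's `apply_delta` script. Roughly as follows (flag spellings vary slightly across FastChat versions, and the LLaMA path below is a placeholder you must supply yourself):

```bash
python3 -m fastchat.model.apply_delta \
    --base-model-path /path/to/llama-7b \
    --target-model-path model/vicuna-7b-v0 \
    --delta-path lmsys/vicuna-7b-delta-v0
```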
I apologize for the misunderstanding; I have changed the directory from `model/vicuna-7b` to `model/vicuna-7b-v0` in the config and README files to provide clearer instructions.
Thanks a lot for your reply! I got a CUDA out-of-memory error when trying to load hawkeye.pth. Here's my output. I'm working with a K80 GPU with 12 GB of memory (I have 4 GPUs like this, but I used only one). Could you tell me how to resolve this bug, please? What resources do I need to run the demo? Is your code compatible with multi-GPU use?
```
OutOfMemoryError                          Traceback (most recent call last)
Cell In[3], line 25
     22 cfg.model.vision_encoder.num_frames = 4
     24 model = HawkEye_it(config=cfg.model)
---> 25 model.set_device_ids([cfg.device])

File ~/VLM/HawkEye/models/hawkeye_it.py:210, in HawkEye_it.set_device_ids(self, device_ids)
    208 self.extra_query_tokens = nn.Parameter(self.extra_query_tokens.to(self.devices[0]))
    209 self.llama_proj.to(self.devices[0])
--> 210 self.llama_model.to(self.devices[0])

File ~/envs/video-text/lib/python3.9/site-packages/transformers/modeling_utils.py:1900, in PreTrainedModel.to(self, *args, **kwargs)
   1895     raise ValueError(
   1896         "`.to` is not supported for `4-bit` or `8-bit` models. Please use the model as it is, since the"
   1897         " model has already been set to the correct devices and casted to the correct `dtype`."
   1898     )
   1899 else:
-> 1900     return super().to(*args, **kwargs)

File ~/envs/video-text/lib/python3.9/site-packages/torch/nn/modules/module.py:1145, in Module.to(self, *args, **kwargs)
   1141             return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
   1142                         non_blocking, memory_format=convert_to_format)
   1143         return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
-> 1145     return self._apply(convert)

File ~/envs/video-text/lib/python3.9/site-packages/torch/nn/modules/module.py:797, in Module._apply(self, fn)
    795 def _apply(self, fn):
    796     for module in self.children():
--> 797         module._apply(fn)
    799 def compute_should_use_set_data(tensor, tensor_applied):
    800     if torch._has_compatible_shallow_copy_type(tensor, tensor_applied):
    801         # If the new tensor has compatible tensor type as the existing tensor,
    802         # the current behavior is to change the tensor in-place using `.data =`,
    (...)
    807         # global flag to let the user control whether they want the future
    808         # behavior of overwriting the existing tensor or not.

    [... skipping similar frames: Module._apply at line 797 (2 times)]

File ~/envs/video-text/lib/python3.9/site-packages/torch/nn/modules/module.py:820, in Module._apply(self, fn)
    816 # Tensors stored in modules are graph leaves, and we don't want to
    817 # track autograd history of `param_applied`, so we have to use
    818 # `with torch.no_grad():`
    819 with torch.no_grad():
--> 820     param_applied = fn(param)
    821 should_use_set_data = compute_should_use_set_data(param, param_applied)
    822 if should_use_set_data:

File ~/envs/video-text/lib/python3.9/site-packages/torch/nn/modules/module.py:1143, in Module.to.<locals>.convert(t)
    ...

OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 MiB (GPU 2; 11.17 GiB total capacity; 3.25 GiB already allocated; 7.55 GiB free; 3.35 GiB allowed; 3.32 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
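As an aside, the hint in the last line only addresses fragmentation; it won't make a roughly 14 GB fp16 7B model fit in 12 GB. For the record, it can be set like this (before any CUDA allocation in the process):

```python
# Allocator tuning suggested by the error message; this helps only
# against fragmentation, not against a model that simply does not fit.
import os
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
```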
I've resolved the problem after a deep analysis of the code. Your code actually does work with multiple GPUs, but I added a check on `low_resource` when loading the LLaMA model (in the `hawkeye.py` file):

```python
# Only move the LLaMA backbone to a single device when not in
# low-resource mode; in low-resource mode the device placement is
# already handled at load time.
if not self.low_resource:
    self.llama_model.to(self.devices[0])
```

Then I set `low_resource` to `True` in the config file.
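A minimal sketch of that config change, assuming the demo config exposes the flag under `cfg.model` like the other options in the traceback above (the exact key layout may differ):

```python
# Hypothetical: enable the low-resource / multi-GPU loading path.
cfg.model.low_resource = True
```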
I've also added `device_map` in the `else` block of the same file:

```python
else:
    # Let transformers/accelerate place the fp16 LLaMA weights across
    # all visible GPUs instead of loading everything onto one 12 GB card.
    self.llama_model = LlamaForCausalLM.from_pretrained(
        llama_model_path, config=llama_config,
        torch_dtype=torch.float16,
        device_map="auto",
    )
```

PS: you also need to `pip install bitsandbytes` for k-bit quantization.
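For completeness, here is a hedged sketch of what the complementary `if self.low_resource:` branch can look like. This is my reconstruction following the common MiniGPT-4/VideoChat pattern, using only standard `transformers` kwargs; HawkEye's actual code may differ:

```python
if self.low_resource:
    # Load the backbone directly in 8-bit (this is where bitsandbytes
    # comes in) and let transformers shard it across the visible GPUs;
    # no later .to(device) call is needed -- or even allowed.
    self.llama_model = LlamaForCausalLM.from_pretrained(
        llama_model_path, config=llama_config,
        torch_dtype=torch.float16,
        load_in_8bit=True,
        device_map="auto",
    )
```

This also explains the `ValueError` branch visible in the traceback: once a model is loaded in 8-bit, calling `.to(device)` on it raises, which is exactly why the `if not self.low_resource:` guard above is needed.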
Hello, could you please provide more details on how to run your trained model on a custom video? The demo.py does not work: there is no folder called `model` as mentioned in the config file. I created it manually and downloaded the needed checkpoints; I was able to download every model except the Vicuna one (see the sketch at the end of this post for how I fetched the others). Can you please tell me whether this is the correct version of the weights to download? https://huggingface.co/lmsys/vicuna-7b-v1.1/tree/main
Btw, thanks for this great work!
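For context, here is roughly how I pulled the other checkpoints into the `model` folder (a sketch using `huggingface_hub`; the repo id and target directory are just my guesses from the config):

```python
from huggingface_hub import snapshot_download

# Hypothetical fetch into the directory the config expects; swap in the
# correct Vicuna repo id once confirmed.
snapshot_download(repo_id="lmsys/vicuna-7b-v1.1", local_dir="model/vicuna-7b")
```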