PKU-YuanGroup / Video-LLaVA

【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
https://arxiv.org/pdf/2311.10122.pdf
Apache License 2.0

Error when loading released model on huggingface #131

Open Leo-Yuyang opened 8 months ago

Leo-Yuyang commented 8 months ago

Hi, I hit an error when testing the model by running `CUDA_VISIBLE_DEVICES=0 bash scripts/v1_5/eval/run_qa_msvd.sh`. I downloaded Video-LLaVA-7B from https://huggingface.co/LanguageBind/Video-LLaVA-7B via `git clone` and `git lfs pull`, then ran the command above and got the following error:

```
/data/venv/video/lib/python3.10/site-packages/torchvision/transforms/_functional_video.py:6: UserWarning: The 'torchvision.transforms._functional_video' module is deprecated since 0.12 and will be removed in the future. Please use the 'torchvision.transforms.functional' module instead.
  warnings.warn(
/data/venv/video/lib/python3.10/site-packages/torchvision/transforms/_transforms_video.py:22: UserWarning: The 'torchvision.transforms._transforms_video' module is deprecated since 0.12 and will be removed in the future. Please use the 'torchvision.transforms' module instead.
  warnings.warn(
/data/venv/video/lib/python3.10/site-packages/torchvision/transforms/functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be removed in 0.17. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional.
  warnings.warn(
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/data/venv/video/lib/python3.10/site-packages/transformers/modeling_utils.py", line 460, in load_state_dict
    return torch.load(checkpoint_file, map_location="cpu")
  File "/data/venv/video/lib/python3.10/site-packages/torch/serialization.py", line 797, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "/data/venv/video/lib/python3.10/site-packages/torch/serialization.py", line 283, in __init__
    super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/venv/video/lib/python3.10/site-packages/transformers/modeling_utils.py", line 464, in load_state_dict
    if f.read(7) == "version":
  File "/usr/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 128: invalid start byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/repos/Video-LLaVA/videollava/eval/video/run_inference_video_qa.py", line 179, in <module>
    run_inference(args)
  File "/data/repos/Video-LLaVA/videollava/eval/video/run_inference_video_qa.py", line 107, in run_inference
    tokenizer, model, processor, context_len = load_pretrained_model(args.model_path, args.model_base, model_name)
  File "/data/repos/Video-LLaVA/videollava/model/builder.py", line 107, in load_pretrained_model
    model = LlavaLlamaForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, **kwargs)
  File "/data/venv/video/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2903, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/data/venv/video/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3246, in _load_pretrained_model
    state_dict = load_state_dict(shard_file)
  File "/data/venv/video/lib/python3.10/site-packages/transformers/modeling_utils.py", line 476, in load_state_dict
    raise OSError(
OSError: Unable to load weights from pytorch checkpoint file for 'checkpoints/Video-LLaVA-7B/pytorch_model-00001-of-00002.bin' at 'checkpoints/Video-LLaVA-7B/pytorch_model-00001-of-00002.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
scripts/v1_5/eval/run_qa_tgif.sh: line 37: eval/GPT_Zero_Shot_QA/TGIF_Zero_Shot_QA/Video-LLaVA-7B/merge.jsonl: No such file or directory
scripts/v1_5/eval/run_qa_tgif.sh: line 41: eval/GPT_Zero_Shot_QA/TGIF_Zero_Shot_QA/Video-LLaVA-7B/merge.jsonl: No such file or directory
```

The transformers version is 4.31.0. Other models that I fine-tuned myself load and run inference successfully, but loading your released model fails. Looking forward to your reply!
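For anyone debugging the same failure: `PytorchStreamReader failed reading zip archive: failed finding central directory` usually means the shard on disk is not a real zip-format checkpoint, e.g. a Git LFS pointer stub that was never materialized or a download that was cut off. A minimal stdlib check (a sketch; the path assumes the default `checkpoints/Video-LLaVA-7B` layout used by the eval scripts):

```python
import os
import zipfile

# Hypothetical local path; adjust to wherever the repo was cloned.
shard = "checkpoints/Video-LLaVA-7B/pytorch_model-00001-of-00002.bin"

# A real shard is several GB; an un-materialized LFS pointer is ~130 bytes.
print("size on disk:", os.path.getsize(shard), "bytes")

# PyTorch's zip serialization format requires a valid zip central directory,
# which is exactly what torch.load failed to find in the traceback above.
print("valid zip archive:", zipfile.is_zipfile(shard))

# Git LFS pointer files are plain text starting with this marker.
with open(shard, "rb") as f:
    print("LFS pointer stub:", f.read(24).startswith(b"version https://git-lfs"))
```

If the file turns out to be a pointer stub or truncated, re-running `git lfs pull` inside the checkout (or re-downloading the shard) should repair it.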
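As an alternative to `git clone` + `git lfs pull`, the `huggingface_hub` client can fetch the full snapshot and resume interrupted transfers instead of leaving truncated files behind; a sketch, assuming `huggingface_hub` is installed:

```python
from huggingface_hub import snapshot_download

# Download every file in the model repo into the directory the eval
# scripts expect; partial downloads are resumed rather than truncated.
snapshot_download(
    repo_id="LanguageBind/Video-LLaVA-7B",
    local_dir="checkpoints/Video-LLaVA-7B",
)
```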