DAMO-NLP-SG / Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
BSD 3-Clause "New" or "Revised" License

Fixed training interrupt bug #123

Open bobo0810 opened 10 months ago

bobo0810 commented 10 months ago

Before repair:

TypeError: Caught TypeError in DataLoader worker process 6.
  File "/video_llama/datasets/datasets/webvid_datasets.py", line 70, in __getitem__
    video_path = self._get_video_path(sample_dict)
  File "/video_llama/datasets/datasets/webvid_datasets.py", line 50, in _get_video_path
    rel_video_fp = os.path.join(sample['page_dir'], str(sample['videoid']) + '.mp4')
  File "/opt/conda/lib/python3.10/posixpath.py", line 76, in join
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not float

After repair:

Train: data epoch: [1]  [ 150/2500]  eta: 0:10:20  lr: 0.000098  loss: 2.7766  time: 0.2573  data: 0.0000  max mem: 53623
[15:49:40]ERROR opening: /alluxio/multi-data/webvid/val_file/nan/24205120.mp4, No such file or directory
Failed to load examples with video: /alluxio/multi-data/webvid/val_file/nan/24205120.mp4. Will randomly sample an example as a replacement.
Train: data epoch: [1]  [ 200/2500]  eta: 0:10:04  lr: 0.000098  loss: 2.3127  time: 0.2587  data: 0.0000  max mem: 53623

bobo0810 commented 10 months ago

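Fixed a bug that interrupted training when page_dir was NaN. A NaN page_dir is a float, so os.path.join raised a TypeError inside the DataLoader worker. With the fix, the affected video simply fails to load and a randomly sampled example is used in its place, so training continues.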
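
For readers who want to see the idea in isolation, here is a minimal sketch. It is not the actual diff of this PR. It assumes the fix casts page_dir to str, which is inferred only from the "nan" path component in the log above. The names get_video_path, load_with_fallback, load_fn and num_retries are hypothetical and are not taken from webvid_datasets.py.

```python
import os
import random


def get_video_path(sample, root):
    """Build the video path for one metadata row.

    Assumption: wrapping page_dir in str() is the repair. A float NaN then
    becomes the directory name "nan" instead of crashing os.path.join.
    """
    rel_video_fp = os.path.join(str(sample["page_dir"]),
                                str(sample["videoid"]) + ".mp4")
    return os.path.join(root, rel_video_fp)


def load_with_fallback(samples, index, root, load_fn, num_retries=10):
    """Try to load samples[index]; on failure, retry with a random sample.

    load_fn is a hypothetical callable that takes a path and raises if the
    video cannot be opened.
    """
    for _ in range(num_retries):
        path = get_video_path(samples[index], root)
        try:
            return load_fn(path)
        except Exception:
            print(f"Failed to load examples with video: {path}. "
                  "Will randomly sample an example as a replacement.")
            # Pick a different (random) row and try again.
            index = random.randint(0, len(samples) - 1)
    raise RuntimeError(f"Failed to fetch a video after {num_retries} retries.")
```

With this shape, a row whose page_dir is NaN produces a path that does not exist, the load fails in the normal way, and the retry loop replaces it, which matches the "After repair" log. An alternative would be to drop rows with a missing page_dir when the annotation file is read, which avoids the wasted load attempt.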