antoyang / TubeDETR

[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers
Apache License 2.0

Training error in tubedetr.py file. #4

Open OliverHxh opened 2 years ago

OliverHxh commented 2 years ago

I am trying to train the network on the HC-STVGv2 dataset using the command provided in the README.md file:

python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --ema \
  --load=pretrained_resnet101_checkpoint.pth --combine_datasets=hcstvg --combine_datasets_val=hcstvg \
  --v2 --dataset_config config/hcstvg.json --epochs=20 --output-dir=output --batch_size=8

Unfortunately, I hit the following error at line 180 of models/tubedetr.py:

  File "/root/paddlejob/workspace/STVG/TubeDETR/models/tubedetr.py", line 180, in forward
    tpad_src = tpad_src.view(b * n_clips, f, h, w)
RuntimeError: shape '[160, 256, 7, 12]' is invalid for input of size 2817024

For reference, the durations of the eight samples in the batch are: [100, 100, 69, 100, 65, 86, 100, 100].

I think this problem is probably related to the padding approach. Do you have any idea what causes this bug and how to fix it? Thank you very much!
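To make the mismatch concrete, here is a quick arithmetic check using only the numbers in the error message. It is a sketch, not a diagnosis: it shows how many elements the requested view needs versus how many the tensor holds. The per-video figure assumes b = 8, which is a guess based on the eight durations listed, not something the traceback confirms.

```python
# Numbers copied from the RuntimeError message.
target_shape = (160, 256, 7, 12)  # (b * n_clips, f, h, w) requested by .view()
input_numel = 2817024             # elements actually present in tpad_src

f, h, w = target_shape[1:]
per_slot = f * h * w                          # elements in one (f, h, w) feature map
expected_numel = target_shape[0] * per_slot   # elements the view would need
actual_slots, remainder = divmod(input_numel, per_slot)

# Assumption: b = 8 (one entry per listed duration), so each video is
# expected to fill target_shape[0] // 8 slots after temporal padding.
assumed_b = 8
slots_per_video = target_shape[0] // assumed_b

print(expected_numel)             # 3440640
print(actual_slots, remainder)    # 131 0
print(slots_per_video)            # 20
```

So the tensor holds exactly 131 feature maps of size (256, 7, 12), while the view asks for 160. That is consistent with the shorter samples not being padded up to the same length as the longest ones before the reshape.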

antoyang commented 2 years ago

All my experiments used a batch size of 1 video per GPU, since long, high-resolution videos already take quite a bit of GPU memory, so there may indeed be some padding to fix.
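For anyone attempting a larger batch, the general idea of padding variable-length samples to a common length can be sketched as below. This is only an illustration of the bookkeeping with plain Python lists: `pad_to_longest` is a hypothetical helper, not a function in TubeDETR, and the mask convention (True marks a real frame) is an assumption chosen for this example. The actual fix would have to operate on the feature tensors inside models/tubedetr.py.

```python
def pad_to_longest(seqs, pad_value=0):
    """Right-pad every sequence to the longest length in the batch.

    Hypothetical helper for illustration only.
    Returns (padded, mask), where mask[i][t] is True for real entries
    and False for padding.
    """
    max_len = max(len(s) for s in seqs)
    padded = [list(s) + [pad_value] * (max_len - len(s)) for s in seqs]
    mask = [[True] * len(s) + [False] * (max_len - len(s)) for s in seqs]
    return padded, mask


# Durations reported in the issue; frame indices stand in for real features.
durations = [100, 100, 69, 100, 65, 86, 100, 100]
frames = [list(range(d)) for d in durations]

padded, mask = pad_to_longest(frames)

# After padding, every sample has the same length, so a batched reshape
# over (batch, time) no longer depends on the individual durations.
lengths = [len(p) for p in padded]
real_counts = [sum(m) for m in mask]
```

With every sample padded to the same length, a reshape that assumes a fixed number of time steps per video becomes valid, and the mask lets later layers ignore the padded positions.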

Glupayy commented 2 years ago

Hi, I encountered the same issue. Did you fix it?

hyundodo commented 1 year ago

Hi, I want to increase the batch size, too. Did you fix it?

AKASH2907 commented 1 year ago

Hi, was anybody able to solve this issue?