mit-han-lab / temporal-shift-module

[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
https://arxiv.org/abs/1811.08383
MIT License

nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [0,0,0] Assertion `t >= 0 && t < n_classes` failed. #229

Closed. cherylngsy closed this issue 1 year ago

cherylngsy commented 1 year ago

Has anyone run into this problem? I'm trying to run scripts/finetune_tsm_ucf101_rgb_8f.sh on the UCF101 dataset.

Stack Trace

Epoch: [0][0/476], lr: 0.00100 Time 6.718 (6.718) Data 1.149 (1.149) Loss 4.6030 (4.6030) Prec@1 0.000 (0.000) Prec@5 15.000 (15.000)
/opt/conda/conda-bld/pytorch_1659484808560/work/aten/src/ATen/native/cuda/Loss.cu:271: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [0,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1659484808560/work/aten/src/ATen/native/cuda/Loss.cu:271: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [11,0,0] Assertion `t >= 0 && t < n_classes` failed.

RuntimeError: CUDA error: device-side assert triggered
scripts/finetune_tsm_ucf101_rgb_8f.sh: line 6: 643848 Aborted (core dumped) CUDA_VISIBLE_DEVICES=1 CUDA_LAUNCH_BLOCKING=1 python main.py ucf101 RGB --arch resnet50 --num_segments 8 --gd 20 --lr 0.001 --lr_steps 10 20 --epochs 25 --batch-size 20 -j 16 --dropout 0.8 --consensus_type=avg --eval-freq=1 --shift --shift_div=8 --shift_place=blockres --tune_from=pretrained/TSM_kinetics_RGB_resnet50_shift8_blockres_avg_segment8_e50.pth
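
For context, the assertion `t >= 0 && t < n_classes` fires when a target label passed to the loss lies outside the range [0, n_classes). One possible cause, not confirmed in this thread, is an annotation list whose labels start at 1 instead of 0. Below is a minimal sketch for checking a list file, assuming each non-empty line holds whitespace-separated fields with the integer class label last. The function name `find_out_of_range_labels` and the sample lines are hypothetical and are not part of this repository.

```python
def find_out_of_range_labels(lines, num_classes):
    """Return (line_number, label) pairs whose label is outside [0, num_classes).

    Assumption: every non-empty line ends with an integer class label,
    e.g. "<frame_dir> <num_frames> <label>".
    """
    bad = []
    for lineno, line in enumerate(lines, start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        label = int(parts[-1])
        if not 0 <= label < num_classes:
            bad.append((lineno, label))
    return bad


# Hypothetical sample lines, made up for illustration only.
sample = [
    "ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01 120 0",
    "YoYo/v_YoYo_g01_c01 95 101",        # too large for 101 classes (valid: 0..100)
    "Archery/v_Archery_g01_c01 80 -1",   # negative label
]

bad_labels = find_out_of_range_labels(sample, num_classes=101)
print(bad_labels)  # [(2, 101), (3, -1)]
```

To check a real list, pass the lines of the file (for example, `open(path).read().splitlines()`) together with the number of classes the model was built for. If every label is one too high, subtracting 1 when the list is generated would be one way to bring them back into range.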