This repo contains the official code of our work SAM-SLR, which won the CVPR 2021 Challenge on Large Scale Signer Independent Isolated Sign Language Recognition.
IndexError in forward func of Conv3D/Sign_Isolated_Conv3D_clip.py #27
I trained the RGB Conv3D model for one epoch without modifying anything in the source code. Once the first training epoch finishes and the script reaches val_epoch, it fails like this:
$ python Conv3D/Sign_Isolated_Conv3D_clip.py
...
######################Training Started######################
lr: 0.001
epoch 1 | iteration 80 | Loss 5.711482 | Acc 0.00%
epoch 1 | iteration 160 | Loss 5.423379 | Acc 0.00%
epoch 1 | iteration 240 | Loss 5.502132 | Acc 14.29%
epoch 1 | iteration 320 | Loss 5.452106 | Acc 0.00%
epoch 1 | iteration 400 | Loss 5.348779 | Acc 0.00%
epoch 1 | iteration 480 | Loss 5.369306 | Acc 0.00%
epoch 1 | iteration 560 | Loss 5.412856 | Acc 0.00%
epoch 1 | iteration 640 | Loss 5.431209 | Acc 0.00%
epoch 1 | iteration 720 | Loss 5.376038 | Acc 0.00%
epoch 1 | iteration 800 | Loss 5.504383 | Acc 0.00%
epoch 1 | iteration 880 | Loss 5.414754 | Acc 0.00%
epoch 1 | iteration 960 | Loss 5.481614 | Acc 0.00%
epoch 1 | iteration 1040 | Loss 5.402166 | Acc 0.00%
epoch 1 | iteration 1120 | Loss 5.561030 | Acc 0.00%
epoch 1 | iteration 1200 | Loss 5.304134 | Acc 14.29%
epoch 1 | iteration 1280 | Loss 5.452147 | Acc 0.00%
epoch 1 | iteration 1360 | Loss 5.429211 | Acc 0.00%
epoch 1 | iteration 1440 | Loss 5.503419 | Acc 0.00%
epoch 1 | iteration 1520 | Loss 5.407657 | Acc 0.00%
epoch 1 | iteration 1600 | Loss 5.423106 | Acc 0.00%
epoch 1 | iteration 1680 | Loss 5.427852 | Acc 0.00%
epoch 1 | iteration 1760 | Loss 5.387938 | Acc 0.00%
epoch 1 | iteration 1840 | Loss 5.491746 | Acc 0.00%
epoch 1 | iteration 1920 | Loss 5.375609 | Acc 0.00%
epoch 1 | iteration 2000 | Loss 5.529760 | Acc 0.00%
epoch 1 | iteration 2080 | Loss 5.462255 | Acc 0.00%
epoch 1 | iteration 2160 | Loss 5.383886 | Acc 0.00%
epoch 1 | iteration 2240 | Loss 5.354466 | Acc 0.00%
epoch 1 | iteration 2320 | Loss 5.439829 | Acc 0.00%
epoch 1 | iteration 2400 | Loss 5.484483 | Acc 0.00%
epoch 1 | iteration 2480 | Loss 5.388660 | Acc 0.00%
epoch 1 | iteration 2560 | Loss 5.336263 | Acc 0.00%
epoch 1 | iteration 2640 | Loss 5.511293 | Acc 0.00%
epoch 1 | iteration 2720 | Loss 5.430277 | Acc 0.00%
epoch 1 | iteration 2800 | Loss 5.447950 | Acc 0.00%
epoch 1 | iteration 2880 | Loss 5.434804 | Acc 0.00%
epoch 1 | iteration 2960 | Loss 5.414961 | Acc 0.00%
epoch 1 | iteration 3040 | Loss 5.452834 | Acc 0.00%
epoch 1 | iteration 3120 | Loss 5.405386 | Acc 0.00%
epoch 1 | iteration 3200 | Loss 5.377852 | Acc 0.00%
epoch 1 | iteration 3280 | Loss 5.378382 | Acc 0.00%
epoch 1 | iteration 3360 | Loss 5.481858 | Acc 0.00%
epoch 1 | iteration 3440 | Loss 5.544360 | Acc 0.00%
epoch 1 | iteration 3520 | Loss 5.439571 | Acc 0.00%
epoch 1 | iteration 3600 | Loss 5.497654 | Acc 0.00%
epoch 1 | iteration 3680 | Loss 5.374403 | Acc 0.00%
epoch 1 | iteration 3760 | Loss 5.400540 | Acc 0.00%
epoch 1 | iteration 3840 | Loss 5.482468 | Acc 0.00%
epoch 1 | iteration 3920 | Loss 5.428809 | Acc 0.00%
epoch 1 | iteration 4000 | Loss 5.400549 | Acc 0.00%
Average Training Loss of Epoch 1: 5.445218 | Acc: 0.39%
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:490: UserWarning: This DataLoader will create 6 worker processes in total. Our suggested max number of worker in current system is 4, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
cpuset_checked))
Traceback (most recent call last):
File "/content/codebase/CVPR21Chal-SLR/Conv3D/Sign_Isolated_Conv3D_clip.py", line 165, in <module>
logger, writer)
File "/content/codebase/CVPR21Chal-SLR/Conv3D/validation_clip.py", line 27, in val_epoch
loss = criterion(outputs, labels.squeeze())
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/content/codebase/CVPR21Chal-SLR/Conv3D/Sign_Isolated_Conv3D_clip.py", line 27, in forward
nll_loss = -logprobs.gather(dim=-1, index=target.unsqueeze(1))
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
Do you have any suggestions as to what could be wrong here, and why the forward function behaves so strangely?
More importantly, what could be the solution to this problem?
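One possible explanation, offered as a guess rather than a confirmed diagnosis: the range `[-1, 0]` in the error message is what `unsqueeze` accepts for a 0-dimensional tensor. That would happen if the last validation batch holds a single sample, because `labels.squeeze()` then removes the batch dimension as well. The sketch below reproduces this under stated assumptions: labels are assumed to have shape `(batch, 1)`, `logprobs` is assumed to come from `log_softmax`, and `num_classes` and the helper name `nll_from_traceback` are made up for illustration. Only the `gather` line is taken from the traceback.

```python
import torch
import torch.nn.functional as F

num_classes = 5  # arbitrary value, for illustration only


def nll_from_traceback(logits, target):
    # Assumption: logprobs is a log-softmax over the class dimension.
    logprobs = F.log_softmax(logits, dim=-1)
    # This line mirrors the one shown in the traceback.
    return -logprobs.gather(dim=-1, index=target.unsqueeze(1))


# Simulate a final batch that contains exactly one sample.
logits = torch.randn(1, num_classes)
labels = torch.tensor([[2]])  # assumed label shape: (batch, 1)

# squeeze() with no argument drops every size-1 dimension,
# so the batch dimension disappears and the result is 0-dimensional.
squeezed = labels.squeeze()

try:
    nll_from_traceback(logits, squeezed)
    raised = False
except IndexError as err:
    raised = True
    print(err)

# Flattening to 1-D keeps the batch dimension for any batch size.
flat = labels.view(-1)
loss = nll_from_traceback(logits, flat)
print(flat.shape, loss.shape)
```

If this is indeed the cause, replacing `labels.squeeze()` with `labels.view(-1)` (or `labels.squeeze(1)`) in validation_clip.py may avoid the error. Passing `drop_last=True` to the validation DataLoader would also sidestep it, at the cost of skipping the final sample.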