matyasbohacek / spoter

Repository accompanying the "Sign Pose-based Transformer for Word-level Sign Language Recognition" paper
https://spoter.signlanguagerecognition.com
Apache License 2.0

IndexError when training model #4

Open adhithiyaa-git opened 2 years ago

adhithiyaa-git commented 2 years ago

This is the command I used to train the model:

python -m train --experiment_name "Spoter" --training_set_path "data/WLASL100_train_25fps.csv" --validation_set_path "data/WLASL100_val_25fps.csv" --testing_set_path "data/WLASL100_test_25fps.csv"

I get the following error after the program runs for a while:

Starting Spoter...

Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/content/drive/MyDrive/Spoter/train.py", line 272, in <module>
    train(args)
  File "/content/drive/MyDrive/Spoter/train.py", line 174, in train
    train_loss, _, _, train_acc = train_epoch(slrt_model, train_loader, cel_criterion, sgd_optimizer, device)
  File "/content/drive/MyDrive/Spoter/spoter/utils.py", line 19, in train_epoch
    loss = criterion(outputs[0], labels[0])
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/loss.py", line 1152, in forward
    label_smoothing=self.label_smoothing)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py", line 2846, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
IndexError: Target 78 is out of bounds.

Changing parameters like the epochs and learning rate does not fix the issue.

matyasbohacek commented 2 years ago

Hi @adhithiyaa-git,

It seems that you are training on the WLASL100 dataset (100 classes) but keeping the default --num_classes argument, which is 64. Any label with an index of 64 or higher then falls outside the model's output range, which causes the IndexError.

Try adding --num_classes 100 to your training command and please let me know if this helps.
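
For anyone who hits this later: PyTorch's cross-entropy loss only accepts target indices in the range [0, num_classes). A minimal standalone sketch (not SPOTER code; the class count and label value are just illustrative) that reproduces the message above:

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# Default head size in train.py is 64 classes, but WLASL100 labels go up to 99.
logits_64 = torch.randn(1, 64)        # model output for one sample, 64 classes
label = torch.tensor([78])            # a WLASL100 class index >= 64

try:
    criterion(logits_64, label)
except IndexError as e:
    print(e)                          # "Target 78 is out of bounds."

# With a 100-class head (--num_classes 100) the same label is valid.
logits_100 = torch.randn(1, 100)
print(criterion(logits_100, label))   # finite loss, no error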

rhngpt commented 2 years ago

Hi, I was getting the same problem and found this issue. I tried adding --num_classes 100, but I got another error.

The command I used to train the model:

python -m train --experiment_name trial --epochs 1 --training_set_path "C:\Users\Asus\Documents\Lecture Notes\Intro to Artificial Intelligence\spoter-main\datasets\WLASL100_train_25fps.csv" --validation_set_path "C:\Users\Asus\Documents\Lecture Notes\Intro to Artificial Intelligence\spoter-main\datasets\WLASL100_val_25fps.csv" --testing_set_path "C:\Users\Asus\Documents\Lecture Notes\Intro to Artificial Intelligence\spoter-main\datasets\WLASL100_test_25fps.csv" --num_classes 100

The error:

Starting trial...

Traceback (most recent call last):
  File "C:\Users\Asus\anaconda3\envs\CS5804_AI\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Asus\anaconda3\envs\CS5804_AI\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\Asus\Documents\Lecture Notes\Intro to Artificial Intelligence\spoter-main\train.py", line 271, in <module>
    train(args)
  File "C:\Users\Asus\Documents\Lecture Notes\Intro to Artificial Intelligence\spoter-main\train.py", line 174, in train
    train_loss, _, _, train_acc = train_epoch(slrt_model, train_loader, cel_criterion, sgd_optimizer, device)
  File "C:\Users\Asus\Documents\Lecture Notes\Intro to Artificial Intelligence\spoter-main\spoter\utils.py", line 19, in train_epoch
    loss = criterion(outputs[0], labels[0])
  File "C:\Users\Asus\anaconda3\envs\CS5804_AI\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\Asus\anaconda3\envs\CS5804_AI\lib\site-packages\torch\nn\modules\loss.py", line 1047, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "C:\Users\Asus\anaconda3\envs\CS5804_AI\lib\site-packages\torch\nn\functional.py", line 2693, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "C:\Users\Asus\anaconda3\envs\CS5804_AI\lib\site-packages\torch\nn\functional.py", line 2388, in nll_loss
    ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
IndexError: Target -1 is out of bounds.

andresherrera97 commented 2 years ago

The issue is solved by the change proposed in this comment from another issue thread: https://github.com/matyasbohacek/spoter/issues/2#issuecomment-1172304554
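
For context, and as an assumption rather than a restatement of the linked patch: a target of -1 usually means that labels which are already 0-based were decremented once more somewhere in the data-loading path, since cross-entropy only accepts targets in [0, num_classes). A minimal standalone sketch (not SPOTER code) of that failure mode:

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.randn(1, 100)          # 100-class head, as with --num_classes 100

label = torch.tensor([0])             # first WLASL100 class, already 0-based
try:
    criterion(logits, label - 1)      # an extra "- 1" turns it into target -1
except IndexError as e:
    print(e)                          # "Target -1 is out of bounds."

print(criterion(logits, label))       # keeping the 0-based label works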