fcdl94 / WILSON

Official implementation of "Incremental Learning in Semantic Segmentation from Image Labels"
https://arxiv.org/abs/2112.01882
MIT License

RuntimeError: Class values must be smaller than num_classes. #12

Closed adaxidedakaonang closed 1 year ago

adaxidedakaonang commented 1 year ago

Hi, when I try to run the code on a single GPU (I removed the distributed-related code), I get the following strange error:

Epoch 1, lr = 0.009699: 100%|█| 154/154 [02:29<00:00,  1.03it/s, loss=0.41
INFO:0: Epoch 1, Class Loss=0.4461328089237213, Reg Loss=0.0
INFO:0: End of Epoch 1/30, Average Loss=0.4461328089237213, Class Loss=0.4461328089237213, Reg Loss=0.0
INFO:0: End of Validation 1/30
INFO:0: Epoch 2, lr = 0.009398
Epoch 2, lr = 0.009398: 100%|█| 154/154 [02:31<00:00,  1.02it/s, loss=0.33
INFO:0: Epoch 2, Class Loss=0.3124752342700958, Reg Loss=0.0
INFO:0: End of Epoch 2/30, Average Loss=0.3124752342700958, Class Loss=0.3124752342700958, Reg Loss=0.0
INFO:0: End of Validation 2/30
INFO:0: Epoch 3, lr = 0.009095
Epoch 3, lr = 0.009095:  31%|▎| 48/154 [00:47<01:46,  1.01s/it, loss=0.233
Traceback (most recent call last):
  File "run.py", line 227, in <module>
    main(opts)
  File "run.py", line 123, in main
    epoch_loss = trainer.train(cur_epoch=cur_epoch, train_loader=train_loader)
  File "/home/lttm/Desktop/chang/WILSON-main/train.py", line 194, in train
    loss = criterion(outputs, labels)  # B x H x W
  File "/home/lttm/miniconda3/envs/pytorch1.8/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/lttm/Desktop/chang/WILSON-main/utils/loss.py", line 73, in forward
    targets = F.one_hot(labels_new, inputs.shape[1] + 1).float().permute(0, 3, 1, 2)
RuntimeError: Class values must be smaller than num_classes.

I also can't wake up my PC afterwards. Has anyone else encountered this strange problem?

fcdl94 commented 1 year ago

Hey @adaxidedakaonang. The error means that labels_new contains a class index greater than or equal to inputs.shape[1] + 1, which is the num_classes value passed to F.one_hot. Please check that the number of classes is set correctly and that the labels do not contain any invalid values.
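
One way to run that check is to list the label values that fall outside the range F.one_hot accepts before the loss is computed. The snippet below is only a minimal sketch: find_invalid_labels, the toy tensor, and the choice of 255 as a stray value are assumptions made for illustration, not code from the repository.

```python
import torch
import torch.nn.functional as F


def find_invalid_labels(labels: torch.Tensor, num_classes: int) -> list:
    """Return the label values that F.one_hot(labels, num_classes) would reject.

    Hypothetical helper for debugging; it is not part of WILSON.
    """
    values = torch.unique(labels)  # sorted unique label values
    bad = values[(values < 0) | (values >= num_classes)]
    return bad.tolist()


# Toy batch (assumption): 2 label maps of 2x2 pixels, model with 3 output channels,
# so one_hot is called with num_classes = 3 + 1 = 4, mirroring inputs.shape[1] + 1.
num_channels = 3
labels = torch.tensor([[[0, 1], [2, 3]],
                       [[1, 255], [0, 2]]])  # 255 stands in for a stray value

invalid = find_invalid_labels(labels, num_channels + 1)
print(invalid)  # [255]

# Illustration only: once the stray value is mapped into range
# (here to the extra last index), the call from the traceback succeeds.
labels_ok = labels.clone()
labels_ok[labels_ok == 255] = num_channels
targets = F.one_hot(labels_ok, num_channels + 1).float().permute(0, 3, 1, 2)
print(targets.shape)  # torch.Size([2, 4, 2, 2])
```

Calling such a helper on labels_new just before the F.one_hot line would show whether the failing batch really contains out-of-range values or whether the data is being corrupted elsewhere.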

adaxidedakaonang commented 1 year ago

Thanks, it seems to be a problem with the GPU rather than the code. Decreasing num_workers (from 4 to 2) and reducing the batch size solved the problem.