voldemortX / pytorch-auto-drive

PytorchAutoDrive: Segmentation models (ERFNet, ENet, DeepLab, FCN...) and Lane detection models (SCNN, RESA, LSTR, LaneATT, BézierLaneNet...) based on PyTorch with fast training, visualization, benchmarking & deployment help
BSD 3-Clause "New" or "Revised" License
837 stars 137 forks

read mask error? only batches of spatial targets supported (3D tensors) but got targets of size: : [1, 360, 640, 3] #118

Open mengxia1994 opened 1 year ago

mengxia1994 commented 1 year ago

The error is:

RuntimeError: only batches of spatial targets supported (3D tensors) but got targets of size: : [1, 360, 640, 3]

I think it is because the dataloader reads the mask PNG as 3-channel RGB. If I reshape it in pytorch-auto-drive/utils/runners/lane_det_trainer.py with labels = labels[:, :, :, 0].clone(), I get:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 128, 45, 80]], which is output 0 of ReluBackward0, is at version 20; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

I have searched and tried several ways; maybe it can only be solved earlier, e.g. in the dataloader. Please tell me where I went wrong and how to solve this.

By the way, I am using a custom dataset organized in TuSimple format, and I used pytorch-auto-drive/tools/tusimple_list_convertor.py to generate the txt lists (with some paths changed). I am using the resa_resnet18_tusimple config. The mask PNG data should not be the problem, because I have used it to train several other open-source codebases.
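For illustration, here is a minimal sketch of the shape mismatch using a synthetic mask (the sizes match the error message; the array itself is made up). F.cross_entropy expects a target of class indices with shape (N, H, W), so a trailing channel dimension has to be dropped somewhere before the loss:

```python
import numpy as np

# Hypothetical 3-channel mask as the dataloader may have read it:
# shape (H, W, 3), with the class id duplicated across the channels.
rgb_mask = np.zeros((360, 640, 3), dtype=np.uint8)
rgb_mask[100:110, :, :] = 1  # a band of lane pixels with class id 1

# cross_entropy wants (N, H, W) class indices, so take one channel:
single = rgb_mask[:, :, 0]
print(single.shape)  # (360, 640)
```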

mengxia1994 commented 1 year ago

help!

voldemortX commented 1 year ago

@mengxia1994 Just to be sure, what is the format of your seg label? Is it a single-channel PNG? And which line of code yielded that error message?

voldemortX commented 1 year ago

I'm guessing your labels are one-hot, in which case you need to turn them into their argmax. Or are they duplicated along the last dimension?
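If the labels are indeed one-hot, the conversion could look like the sketch below (shapes are assumed from the error message; the tensor contents are synthetic):

```python
import torch

# Assumed one-hot labels of shape [N, H, W, C] (6 classes: background + 5 lanes)
onehot = torch.zeros(1, 360, 640, 6)
onehot[..., 0] = 1                 # background everywhere
onehot[0, 100:110, :, 0] = 0
onehot[0, 100:110, :, 1] = 1       # a band of lane-1 pixels

# cross_entropy expects integer class indices of shape [N, H, W]:
targets = onehot.argmax(dim=-1)
print(targets.shape)     # torch.Size([1, 360, 640])
print(targets.unique())  # tensor([0, 1])
```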

mengxia1994 commented 1 year ago

Thank you for your reply! The message is below:

Traceback (most recent call last):
  File "main_landet.py", line 65, in <module>
    runner.run()
  File "/home/deep_learning/pytorch-auto-drive/utils/runners/lane_det_trainer.py", line 56, in run
    loss, log_dict = self.criterion(inputs, labels, existence,
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/deep_learning/pytorch-auto-drive/utils/losses/lane_seg_loss.py", line 27, in forward
    segmentation_loss = F.cross_entropy(prob_maps, targets, weight=self.weight,
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/functional.py", line 2996, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
RuntimeError: only batches of spatial targets supported (3D tensors) but got targets of size: : [1, 360, 640, 3]

I just checked my seg labels again; they are single-channel PNGs. In simple words, each PNG consists of h*w pixels, each belonging to 0~5 (background and at most 5 lanes). Is that right?

voldemortX commented 1 year ago

That is weird. If your labels are grayscale images, they should not have this problem. Maybe save some labels to files and check them? Or can you show me your dataloader code, if it was modified?
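A quick sanity check could look like this sketch (the file name is a placeholder, and it writes a synthetic mask first only so the snippet is self-contained; on a real label file you would just do the Image.open part):

```python
import numpy as np
from PIL import Image

# Write a synthetic grayscale mask with class ids 0-5 (placeholder data)
arr = np.random.randint(0, 6, size=(360, 640)).astype(np.uint8)
Image.fromarray(arr, mode="L").save("check_mask.png")  # placeholder name

# Reload it the way a check on a real label would:
reloaded = Image.open("check_mask.png")
print(reloaded.mode)                  # 'L' means single channel; 'RGB' would explain the error
print(np.unique(np.array(reloaded)))  # class ids actually stored in the file
```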