Question about training

Does the source code need special handling for each dataset? For example, I downloaded the MSD dataset from Kaggle (mobile phone defects, divided into 3 defect classes), configured it, and training fails with the following error:

0%| | 0/62 [00:00<?, ?it/s]
../aten/src/ATen/native/cuda/NLLLoss2d.cu:103: nll_loss2d_forward_kernel: block: [0,0,0], thread: [352,0,0] Assertion t >= 0 && t < n_classes failed.
../aten/src/ATen/native/cuda/NLLLoss2d.cu:103: nll_loss2d_forward_kernel: block: [0,0,0], thread: [768,0,0] Assertion t >= 0 && t < n_classes failed.
../aten/src/ATen/native/cuda/NLLLoss2d.cu:103: nll_loss2d_forward_kernel: block: [0,0,0], thread: [160,0,0] Assertion t >= 0 && t < n_classes failed.
Traceback (most recent call last):
  File "corrmatch.py", line 323, in <module>
    main()
  File "corrmatch.py", line 243, in main
    loss_u_s1 = torch.sum(loss_u_s1) / torch.sum(ignore_mask_cutmixed1 != 255).item()
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

A dataset I created myself shows a similar problem.
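The assertion `t >= 0 && t < n_classes` fires when a target label passed to the loss lies outside the range `[0, n_classes)`, so one thing worth checking is which pixel values the mask files actually contain. Below is a minimal sketch of such a check, not code from CorrMatch. The function name `find_invalid_labels`, the example pixel values, and the choice of `n_classes=4` (3 defect classes plus background) are assumptions made for illustration. The ignore value 255 is taken from the `!= 255` comparison in the traceback and is assumed to be the ignore index of the loss as well.

```python
from collections import Counter


def find_invalid_labels(pixel_values, n_classes, ignore_index=255):
    """Return {label: pixel_count} for labels a loss with n_classes would reject.

    A label is treated as valid if it lies in [0, n_classes) or equals
    ignore_index (assumed to be 255 here).
    """
    counts = Counter(int(v) for v in pixel_values)
    return {
        label: count
        for label, count in sorted(counts.items())
        if label != ignore_index and not (0 <= label < n_classes)
    }


# Hypothetical flattened mask: classes 0..3 are valid for n_classes=4,
# 255 is ignored, and 128 is an out-of-range value such as a grayscale
# intensity left in the mask instead of a class index.
example_mask = [0, 0, 1, 2, 3, 255, 128, 128]
print(find_invalid_labels(example_mask, n_classes=4))  # {128: 2}
```

To run this on a real mask, the pixel values could be obtained with something like `numpy.asarray(PIL.Image.open(mask_path)).ravel()`, where `mask_path` is a placeholder for one of the label files. A non-empty result would suggest that the masks need to be remapped to contiguous class indices, or that the configured number of classes is too small.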

BBBBchan / CorrMatch

Question about training #18