Hi, I have great admiration for your wonderful work. However, when I run python train_warm.py to reproduce the experiment, I get the following error:
INFO: <All keys matched successfully>
INFO:
--- load TEST dataset ---
/opt/conda/lib/python3.8/site-packages/torch/optim/sgd.py:69: UserWarning: optimizer contains a parameter group with duplicate parameters; in future, this will cause an error; see github.com/pytorch/pytorch/issues/40967 for more information
super(SGD, self).__init__(params, defaults)
/tmp/pip-req-build-lxbsys88/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [256,0,0] Assertion `t >= 0 && t < n_classes` failed.
/tmp/pip-req-build-lxbsys88/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [257,0,0] Assertion `t >= 0 && t < n_classes` failed.
/tmp/pip-req-build-lxbsys88/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [258,0,0] Assertion `t >= 0 && t < n_classes` failed.
/tmp/pip-req-build-lxbsys88/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [352,0,0] Assertion `t >= 0 && t < n_classes` failed.
/tmp/pip-req-build-lxbsys88/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [255,0,0] Assertion `t >= 0 && t < n_classes` failed.
Traceback (most recent call last):
File "train_warm.py", line 522, in <module>
main()
File "train_warm.py", line 410, in main
loss_seg = seg_loss(pred, labels)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/loss.py", line 1047, in forward
return F.cross_entropy(input, target, weight=self.weight,
File "/opt/conda/lib/python3.8/site-packages/torch/nn/functional.py", line 2690, in cross_entropy
return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/functional.py", line 2387, in nll_loss
ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: CUDA error: device-side assert triggered
Have you encountered a similar problem, and how can I solve it? Thank you!
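For context, the assertion `t >= 0 && t < n_classes` in the log suggests that at least one value in `labels` lies outside the range the loss accepts, that is, it is neither in [0, n_classes) nor equal to the loss's `ignore_index`. Below is a minimal sketch of how this could be checked before the `seg_loss(pred, labels)` call. The helper name `find_invalid_labels` and the class count of 19 are hypothetical and not taken from the repository; the default `ignore_index=-100` is the default of `torch.nn.CrossEntropyLoss` and may differ from what `train_warm.py` configures.

```python
def find_invalid_labels(labels, n_classes, ignore_index=-100):
    """Return the sorted label values that the loss would reject.

    A value is rejected when it is outside [0, n_classes) and is not
    equal to ignore_index. `labels` may be a tensor or array (anything
    with .tolist()) or a plain nested list.
    """
    # Convert tensors/arrays to nested Python lists; leave lists as they are.
    values = labels.tolist() if hasattr(labels, "tolist") else labels

    # Flatten the nested structure and collect the distinct label values.
    seen = set()
    stack = [values]
    while stack:
        item = stack.pop()
        if isinstance(item, (list, tuple)):
            stack.extend(item)
        else:
            seen.add(int(item))

    return sorted(
        v for v in seen
        if v != ignore_index and not (0 <= v < n_classes)
    )


# Hypothetical example: 19 classes, so valid labels are 0..18.
example = [[0, 1, 18], [19, 255, -100]]
print(find_invalid_labels(example, n_classes=19))  # prints [19, 255]
```

In the training loop this could be called as `find_invalid_labels(labels.cpu(), n_classes)` just before the loss is computed. If it reports a value such as a "void" label used by the dataset, one option is to pass that value as `ignore_index` when constructing the loss, or to remap it during data loading. Running the script with the environment variable `CUDA_LAUNCH_BLOCKING=1` also makes the traceback point at the call that actually triggered the assert.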