There were no problems when I ran Clothing1M and CIFAR-10. But when I ran the experiment on CIFAR-100 using `python Train_cifar.py --data_path ./dataset/Cifar-100 --gpuid 0 --dataset cifar100`, the following error came out:
```
Warmup Net1
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THC/generic/THCTensorMath.cu line=26 error=59 : device-side assert triggered
Traceback (most recent call last):
File "Train_cifar.py", line 256, in <module>
warmup(epoch,net1,optimizer1,warmup_trainloader)
File "Train_cifar.py", line 137, in warmup
L.backward()
File "/home/zhuwang/anaconda2/envs/dividemix/lib/python3.6/site-packages/torch/tensor.py", line 93, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/home/zhuwang/anaconda2/envs/dividemix/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THC/generic/THCTensorMath.cu:26
```
From googling, this error usually means a label is out of range or contains -1, but I can't figure out where that happens. Has this ever occurred to you? Thank you kindly.
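For reference, CUDA error 59 from a cross-entropy/NLL loss almost always means a target index outside `[0, num_classes - 1]`. A quick sanity check on the noisy labels before training can confirm this; the sketch below is generic, and how you collect the label list from your loader or noise file is up to your setup:

```python
def check_labels(labels, num_classes):
    """Return True if every label is a valid class index in [0, num_classes - 1]."""
    bad = [l for l in labels if not (0 <= l < num_classes)]
    if bad:
        print(f"{len(bad)} out-of-range labels, e.g. {bad[:10]}")
        return False
    print(f"all {len(labels)} labels are in [0, {num_classes - 1}]")
    return True

# Example: for CIFAR-100 the valid range is 0..99, so 100 and -1 are invalid.
check_labels([3, 99, 100, -1], num_classes=100)  # -> False
```

Running this on the labels produced by the noise-injection step, and double-checking that the network's output head really has 100 units for CIFAR-100 rather than 10, should narrow down where the invalid index comes from.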
Hi, it's me again.
Love all your work and code :)