kimhc6028 / pathnet-pytorch

PyTorch implementation of PathNet: Evolution Channels Gradient Descent in Super Neural Networks
BSD 3-Clause "New" or "Revised" License
80 stars 18 forks source link

RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1512378360668/work/torch/lib/THC/generic/THCTensorMath.cu #6

Open skyde1021 opened 6 years ago

skyde1021 commented 6 years ago

First of all, thank you for the implementation. But I have this error with '--no-graph --cifar-svhn'

THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1512378360668/work/torch/lib/THC/generic/THCTensorMath.cu line=26 error=59 : device-side assert triggered Traceback (most recent call last): File "/workspace/pathnet-pytorch/main.py", line 258, in main() File "/workspace/pathnet-pytorch/main.py", line 169, in main best_fitness, best_path, max_fitness = train_pathnet(model, gene, visualizer, train_loader, best_fitness, best_path, gen, 'm') File "/workspace/pathnet-pytorch/main.py", line 92, in train_pathnet fitness = model.train_model(train_data, path, args.num_batch) File "/workspace/pathnet-pytorch/pathnet.py", line 121, in train_model loss.backward() File "/opt/conda/envs/pytorch-py2.7/lib/python2.7/site-packages/torch/autograd/variable.py", line 167, in backward torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables) File "/opt/conda/envs/pytorch-py2.7/lib/python2.7/site-packages/torch/autograd/init.py", line 99, in backward variables, grad_variables, retain_graph) RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1512378360668/work/torch/lib/THC/generic/THCTensorMath.cu:26 /opt/conda/conda-bld/pytorch_1512378360668/work/torch/lib/THCUNN/ClassNLLCriterion.cu:101: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype , Dtype , Dtype , long , Dtype , int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [3,0,0] Assertion t >= 0 && t < n_classes failed. /opt/conda/conda-bld/pytorch_1512378360668/work/torch/lib/THCUNN/ClassNLLCriterion.cu:101: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype , Dtype , Dtype , long , Dtype , int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [6,0,0] Assertion t >= 0 && t < n_classes failed. /opt/conda/conda-bld/pytorch_1512378360668/work/torch/lib/THCUNN/ClassNLLCriterion.cu:101: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype , Dtype , Dtype , long , Dtype , int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [7,0,0] Assertion t >= 0 && t < n_classes failed. /opt/conda/conda-bld/pytorch_1512378360668/work/torch/lib/THCUNN/ClassNLLCriterion.cu:101: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype , Dtype , Dtype , long , Dtype , int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [8,0,0] Assertion t >= 0 && t < n_classes failed. /opt/conda/conda-bld/pytorch_1512378360668/work/torch/lib/THCUNN/ClassNLLCriterion.cu:101: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype , Dtype , Dtype , long , Dtype , int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [11,0,0] Assertion t >= 0 && t < n_classes failed. /opt/conda/conda-bld/pytorch_1512378360668/work/torch/lib/THCUNN/ClassNLLCriterion.cu:101: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype , Dtype , Dtype , long , Dtype , int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [12,0,0] Assertion t >= 0 && t < n_classes failed. /opt/conda/conda-bld/pytorch_1512378360668/work/torch/lib/THCUNN/ClassNLLCriterion.cu:101: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype , Dtype , Dtype , long , Dtype , int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [13,0,0] Assertion t >= 0 && t < n_classes failed. /opt/conda/conda-bld/pytorch_1512378360668/work/torch/lib/THCUNN/ClassNLLCriterion.cu:101: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype , Dtype , Dtype , long , Dtype , int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [14,0,0] Assertion t >= 0 && t < n_classes failed.