Closed: zhang-wen closed this issue 6 years ago
/opt/conda/conda-bld/pytorch_1512378360668/work/torch/lib/THC/THCTensorIndex.cu:325: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [94,0,0], thread: [29,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1512378360668/work/torch/lib/THC/THCTensorIndex.cu:325: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [94,0,0], thread: [30,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1512378360668/work/torch/lib/THC/THCTensorIndex.cu:325: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [94,0,0], thread: [31,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Traceback (most recent call last):
  File "_main.py", line 202, in <module>
    main()
  File "_main.py", line 197, in main
    trainer.train()
  File "/home/wen/1.research/zh-en/iwslt/with_transformer/trainer.py", line 126, in train
    outputs = self.model(src, trg)
  File "/home/wen/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/wen/1.research/zh-en/iwslt/with_transformer/models/transformer.py", line 65, in forward
    enc_output, enc_slf_attn = self.encoder(src_seq, src_pos)
  File "/home/wen/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/wen/1.research/zh-en/iwslt/with_transformer/models/transformer.py", line 244, in forward
    enc_out, enc_slf_attn = enc_layer(enc_out, src_slf_attn_mask)
  File "/home/wen/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/wen/1.research/zh-en/iwslt/with_transformer/models/transformer.py", line 205, in forward
    enc_output = self.pos_ffn(enc_output)
  File "/home/wen/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/wen/1.research/zh-en/iwslt/with_transformer/models/transformer.py", line 181, in forward
    output = self.dropout(self.w_2(self.relu(self.w_1(x))))
  File "/home/wen/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/wen/anaconda2/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 166, in forward
    self.padding, self.dilation, self.groups)
  File "/home/wen/anaconda2/lib/python2.7/site-packages/torch/nn/functional.py", line 54, in conv1d
    return f(input, weight, bias)
RuntimeError: CUDNN_STATUS_INTERNAL_ERROR
Has anyone else run into this error? I don't know why it happens during training.
You might consider updating your CUDA driver: https://developer.nvidia.com/cudnn
@JianyuZhan Thank you, I have solved my problem.
Hello, I have the same problem. How did you solve it?
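For anyone landing here with the same assertion: `srcIndex < srcSelectDimSize` fires when an index passed to an index-select operation (an embedding lookup is the usual case) is not smaller than the size of the dimension being indexed. Because CUDA kernels run asynchronously, the Python traceback can point at a later, unrelated call such as `conv1d`. A rough way to confirm this is to check the index values on the CPU before the lookup. The sketch below is only an illustration under assumptions: `find_out_of_range`, `n_position`, and the sample values are hypothetical and do not come from the code in this issue, and the indices are assumed to have been flattened into a plain Python list first.

```python
def find_out_of_range(indices, table_size):
    """Return (position, value) pairs for indices outside [0, table_size)."""
    return [(i, v) for i, v in enumerate(indices) if v < 0 or v >= table_size]


# Hypothetical numbers: a position-embedding table with 5 rows and a
# flattened batch of position ids, one of which is too large.
n_position = 5
src_pos = [1, 2, 3, 4, 5, 0]

bad = find_out_of_range(src_pos, n_position)
print(bad)  # [(4, 5)]
```

If this kind of check reports anything, the likely fix is a larger embedding table or clipped/truncated inputs rather than a driver update. Running the script on the CPU, or with the environment variable `CUDA_LAUNCH_BLOCKING=1`, should also make the traceback point at the call that actually failed.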