Closed: w5688414 closed this issue 4 years ago
/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2, IndexIsMajor = true]: block: [39,0,0], thread: [126,0,0] Assertion srcIndex < srcSelectDimSize failed.
/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2, IndexIsMajor = true]: block: [39,0,0], thread: [127,0,0] Assertion srcIndex < srcSelectDimSize failed.
Traceback (most recent call last):
  File "run.py", line 57, in <module>
    main(cmd_args)
  File "run.py", line 39, in main
    trainer.train(train_dataset=train_dataset)
  File "/home/wugaosheng/OpenTransformer/otrans/train.py", line 91, in train
    train_loss = self.train_one_epoch(epoch, train_loader.loader)
  File "/home/wugaosheng/OpenTransformer/otrans/train.py", line 138, in train_one_epoch
    loss = self.model(*batch)
  File "/home/wugaosheng/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/wugaosheng/OpenTransformer/otrans/model/transformer.py", line 54, in forward
    logits, _ = self.decoder(target_in, targets_length, enc_states, enc_mask)
  File "/home/wugaosheng/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/wugaosheng/OpenTransformer/otrans/decoder.py", line 39, in forward
    dec_output = self.pos_encoding(dec_output)
  File "/home/wugaosheng/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/wugaosheng/OpenTransformer/otrans/module.py", line 51, in forward
    self.extend_pe(x)
  File "/home/wugaosheng/OpenTransformer/otrans/module.py", line 30, in extend_pe
    self.pe = self.pe.to(dtype=x.dtype, device=x.device)
RuntimeError: CUDA error: device-side assert triggered
srcIndex < srcSelectDimSize
I ran into this CUDA error.
Please check the vocabulary size in the configuration file. If vocab_size is set smaller than the actual number of entries in your vocabulary, some token ids can fall outside the embedding table, which may cause this assertion to fail.
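As a rough illustration of that check, the sketch below scans target token ids and reports any that an embedding table with `vocab_size` rows could not index. It is plain Python and not part of OpenTransformer; the helper names `find_out_of_range_ids` and `required_vocab_size` and the sample ids are made up for this example, and it assumes your targets are available as lists of integer ids.

```python
from typing import List, Sequence


def find_out_of_range_ids(batches: Sequence[Sequence[int]], vocab_size: int) -> List[int]:
    """Return the sorted, de-duplicated token ids that fall outside [0, vocab_size)."""
    bad = set()
    for ids in batches:
        for token_id in ids:
            # An embedding with vocab_size rows only accepts ids 0 .. vocab_size - 1.
            if token_id < 0 or token_id >= vocab_size:
                bad.add(token_id)
    return sorted(bad)


def required_vocab_size(batches: Sequence[Sequence[int]]) -> int:
    """Smallest vocab_size that covers every id seen in the data (0 if there are no ids)."""
    largest = max((max(ids) for ids in batches if ids), default=-1)
    return largest + 1


# Hypothetical target ids; in practice these would come from your own dataset.
targets = [[1, 5, 9], [2, 4999, 3]]
configured_vocab_size = 4000  # hypothetical value read from the configuration file

bad_ids = find_out_of_range_ids(targets, configured_vocab_size)
if bad_ids:
    print("ids outside the embedding table:", bad_ids)
    print("vocab_size should be at least", required_vocab_size(targets))
```

Running a check like this on the CPU before training tends to give a clearer message than the device-side assert, because CUDA reports such errors asynchronously and the Python traceback may point at an unrelated line.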
@ZhengkunTian thanks, it works