Open dingjibang opened 6 years ago
CUDA 9.0, Windows 10, Python 3.5
Run it with CUDA_LAUNCH_BLOCKING=1 python3 seq2seq.py train
and you should see more information.
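Since the reporter is on Windows 10, note that the `VAR=value command` prefix form is POSIX shell syntax and is not accepted by cmd.exe. A minimal sketch of one way around that is to launch the script from a small Python wrapper that sets the variable in the child's environment. The helper name `run_with_launch_blocking` is hypothetical, and the child command here only echoes the variable; in practice it would be replaced with the `seq2seq.py train` invocation.

```python
import os
import subprocess
import sys


def run_with_launch_blocking(cmd, blocking=True):
    """Run cmd in a child process with CUDA_LAUNCH_BLOCKING set.

    The variable has to be in the environment before the child
    initializes CUDA, so it is set on the child's environment here
    rather than inside the training script.
    """
    env = dict(os.environ)  # copy, so the parent environment is untouched
    env["CUDA_LAUNCH_BLOCKING"] = "1" if blocking else "0"
    return subprocess.run(
        cmd,
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode stdout as text (works on Python 3.5)
    )


# Stand-in child command that only prints the variable it received.
# For the real run, use something like [sys.executable, "seq2seq.py", "train"].
demo_cmd = [sys.executable, "-c",
            "import os; print(os.environ['CUDA_LAUNCH_BLOCKING'])"]
result = run_with_launch_blocking(demo_cmd)
print(result.stdout.strip())
```

In a cmd.exe session the same effect can be had by running `set CUDA_LAUNCH_BLOCKING=1` on its own line before starting Python.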
The output I posted is already from a run with blocking = 1. For completeness, here is the output with blocking = 0:
THCudaCheck FAIL file=c:\new-builder_3\win-wheel\pytorch\aten\src\thc\THCReduceAll.cuh line=317 error=77 : an illegal memory access was encountered
Traceback (most recent call last):
File "seq2seq.py", line 436, in <module>
With 0 or 1 the reported message is the same, no more and no less; only the line number in the error differs.
Could you send me a copy of the data?
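@dingjibang Did you solve this problem? I ran into the same issue. @yanwii However, after I turned off the GPU, training ran to epoch 4000 and then did fail with a segmentation fault.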
Hi, did you manage to solve this? I have run into the same problem.
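I downloaded the project, filled in a few answers and questions, and ran it as a test. It worked and the results were pretty good, so I put together nearly 2 MB of answers and questions. The preprocessing stage passed, but as soon as training started I got the error below.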
THCudaCheck FAIL file=C:/new-builder_3/win-wheel/pytorch/aten/src/ATen/native/cuda/Embedding.cu line=247 error=77 : an illegal memory access was encountered
Traceback (most recent call last):
File "seq2seq.py", line 436, in <module>
seq.train()
File "seq2seq.py", line 210, in train
loss, logits = self.step(inputs, targets, self.max_length)
File "seq2seq.py", line 265, in step
loss.backward()
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\tensor.py", line 93, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\autograd\__init__.py", line 90, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at C:/new-builder_3/win-wheel/pytorch/aten/src/ATen/native/cuda/Embedding.cu:247
I'm completely lost. What should I do?
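The traceback points at Embedding.cu, and one common cause of an illegal memory access in an embedding kernel is a token id that falls outside the embedding table, for example after the vocabulary changes with a larger dataset. Nothing in this thread confirms that this is the cause here, so the following is only a minimal sketch of a sanity check to run on the preprocessed data before training. The function name `find_bad_ids`, the `vocab_size` value, and the sample sequences are all hypothetical and not taken from seq2seq.py.

```python
def find_bad_ids(sequences, vocab_size):
    """Return (sequence_index, position, token_id) for every id
    that is not a valid row of an embedding table with vocab_size rows,
    i.e. every id outside the range [0, vocab_size)."""
    bad = []
    for seq_index, seq in enumerate(sequences):
        for position, token_id in enumerate(seq):
            if not 0 <= token_id < vocab_size:
                bad.append((seq_index, position, token_id))
    return bad


# Hypothetical example data: an embedding table with 5 rows (ids 0..4).
vocab_size = 5
sequences = [
    [0, 1, 4],   # all ids valid
    [2, 5, 3],   # 5 is one past the last valid id
    [-1, 0],     # negative ids are not valid either
]

for seq_index, position, token_id in find_bad_ids(sequences, vocab_size):
    print("sequence %d, position %d: id %d is out of range"
          % (seq_index, position, token_id))
```

If the check reports anything, the size passed when the embedding layer is constructed and the ids produced by preprocessing disagree. Running a single batch on the CPU is another way to test the same hypothesis, since an out-of-range index there normally raises a readable Python exception rather than a CUDA error.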