ajbrock / BigGAN-PyTorch

The author's officially unofficial PyTorch BigGAN implementation.
MIT License

cublas runtime error #64

Open phymhan opened 4 years ago

phymhan commented 4 years ago

First off, huge thanks to Andy for the PyTorch implementation! However, I encountered a cuBLAS error after a few iterations.

When using PyTorch 1.5 with CUDA 10.2 on two RTX 8000 GPUs:

    File "BigGAN-PyTorch/layers.py", line 40, in power_iteration
      u = torch.matmul(v, W.t())
    RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)

When using PyTorch 1.0.1 with CUDA 10.0:

    File "BigGAN-PyTorch/layers.py", line 48, in power_iteration
      svs += [torch.squeeze(torch.matmul(torch.matmul(v, W.t()), u.t()))]
    RuntimeError: cublas runtime error : the GPU program failed to execute at /opt/conda/conda-bld/pytorch_1549636813070/work/aten/src/THC/THCBlas.cu:258

PS: With some modifications the error disappears, for example using vanilla BCE loss instead of hinge loss, or removing the linear layer.

Any idea why this happens? Thanks a lot!
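
One way to narrow this down, offered as a sketch rather than a diagnosis: CUDA kernels are launched asynchronously, so the line named in a traceback (here the matmul calls in power_iteration) is not necessarily the operation that actually failed. Setting the CUDA_LAUNCH_BLOCKING environment variable to 1 before CUDA is initialized makes launches synchronous, so the error surfaces at the call that caused it. The snippet below sets the variable and then, if a GPU is present, repeats the two products from the tracebacks in isolation. The tensor shapes are hypothetical placeholders, not values taken from the BigGAN-PyTorch code.

```python
import os

# Must be set before the first CUDA call; setting it before importing torch is safest.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

try:
    import torch
except ImportError:  # lets the snippet run on machines without PyTorch
    torch = None

if torch is not None and torch.cuda.is_available():
    # Hypothetical shapes: W is (out_features, in_features), u and v are row vectors.
    W = torch.randn(8, 16, device="cuda")
    u = torch.randn(1, 8, device="cuda")
    v = torch.randn(1, 16, device="cuda")

    # The product from the first traceback (layers.py, line 40).
    u_new = torch.matmul(v, W.t())
    # The product from the second traceback (layers.py, line 48).
    sv = torch.squeeze(torch.matmul(torch.matmul(v, W.t()), u.t()))

    # Wait for all queued GPU work so any pending error is raised here.
    torch.cuda.synchronize()
    print(u_new.shape, sv.shape)
```

If these isolated products succeed while training still fails, that would suggest the fault originates in an earlier operation and is only being reported at the matmul. Running the training script itself with CUDA_LAUNCH_BLOCKING=1 should then give a traceback that points at the real source.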