gaelm opened 3 years ago
I haven't looked at the code but the 2x slowdown seems pretty bad. I think this should be high-pri to investigate.
Thanks for the report. Is this the PyTorch 1.7.1 + CUDA 10.2 wheel downloaded with pip? Can you try the CUDA 11.0 wheel and see whether the performance difference still exists?
I have tried PyTorch 1.7.1 with CUDA 11.0 and the issue still exists.
quick update: same issue on PyTorch 1.8.0 with CUDA 11.0
Thanks for the report. I'm able to reproduce it. I have reported it to the cudnn team.
🐛 Bug
A manually implemented stacked RNN/LSTM/GRU with dropout applied between layers (split_fw below) is faster than the standard PyTorch RNN/LSTM/GRU module configured the same way (std_fw below).
Here is the profiler analysis for 20 runs.
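The profiler table itself is not reproduced here; the following sketch shows one way such a 20-run profile can be collected with the autograd profiler available in PyTorch 1.7 (the module sizes are illustrative, not taken from the report):

```python
import torch
import torch.nn as nn

# Illustrative stacked LSTM; the report's actual sizes are not known.
lstm = nn.LSTM(input_size=32, hidden_size=64, num_layers=3, dropout=0.5)
x = torch.randn(10, 4, 32)  # (seq_len, batch, input_size)

# Profile 20 forward passes and aggregate per-op statistics.
with torch.autograd.profiler.profile() as prof:
    for _ in range(20):
        lstm(x)

print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```

On a CUDA build, passing `use_cuda=True` to `profile()` additionally records kernel times on the GPU.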
To Reproduce
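The original reproduction script is not shown; a minimal sketch of the comparison being described, with `std_fw` as the built-in stacked module and `split_fw` as a hand-stacked equivalent (all sizes are illustrative assumptions), could look like this:

```python
import torch
import torch.nn as nn

input_size, hidden_size, num_layers, p = 32, 64, 3, 0.5

# std_fw: the standard stacked module with built-in inter-layer dropout.
std_lstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers, dropout=p)

# split_fw: one single-layer nn.LSTM per layer, with an explicit
# Dropout applied between layers (mirroring nn.LSTM's semantics,
# where dropout is not applied after the last layer).
layers = nn.ModuleList(
    [nn.LSTM(input_size if i == 0 else hidden_size, hidden_size)
     for i in range(num_layers)]
)
drop = nn.Dropout(p)

def split_fw(x):
    for i, layer in enumerate(layers):
        x, _ = layer(x)
        if i < num_layers - 1:
            x = drop(x)
    return x

x = torch.randn(10, 4, input_size)  # (seq_len, batch, input_size)
out_std, _ = std_lstm(x)
out_split = split_fw(x)
assert out_std.shape == out_split.shape == (10, 4, hidden_size)
```

Timing each of the two forward functions over repeated runs (with `torch.cuda.synchronize()` between measurements on GPU) is then enough to observe the reported gap.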
Expected behavior
I expect the standard stacked RNN to be faster than a manually written stacked version.
Environment
PyTorch version: 1.7.1
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.4 LTS (x86_64)
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect

Python version: 3.7 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: GeForce RTX 2080 Ti
GPU 1: GeForce RTX 2080 Ti

Nvidia driver version: 450.36.06
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip] botorch==0.3.3
[pip] gpytorch==1.3.0
[pip] numpy==1.18.5
[pip] torch==1.7.1
[pip] torchvision==0.7.0
[conda] blas 1.0 mkl
[conda] botorch 0.3.3 pypi_0 pypi
[conda] cudatoolkit 10.1.243 h6bb024c_0
[conda] gpytorch 1.3.0 pypi_0 pypi
[conda] mkl 2020.1 217
[conda] mkl-service 2.3.0 py37he904b0f_0
[conda] mkl_fft 1.1.0 py37h23d657b_0
[conda] mkl_random 1.1.1 py37h0573a6f_0
[conda] numpy 1.18.5 py37ha1c710e_0
[conda] numpy-base 1.18.5 py37hde5b4d6_0
[conda] torch 1.7.1 pypi_0 pypi
[conda] torchvision 0.7.0 py37_cu101 pytorch
Additional context
I started by asking for help here on the PyTorch forums before considering it a bug.
cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @csarofeen @ptrblck @xwang233