pytorch / extension-cpp

C++ extensions in PyTorch
Got 'expected 3 dims but tensor has 2' when running the CUDA benchmark #39

Closed zoezhu closed 5 years ago

zoezhu commented 5 years ago

Following is info for my environment:

After cloning the git repo, I ran setup.py:

$ cd $extension-cpp/cuda
$ python setup.py install

Here is what I got:

running install
running bdist_egg
running egg_info
writing lltm_cuda.egg-info/PKG-INFO
writing dependency_links to lltm_cuda.egg-info/dependency_links.txt
writing top-level names to lltm_cuda.egg-info/top_level.txt
reading manifest file 'lltm_cuda.egg-info/SOURCES.txt'
writing manifest file 'lltm_cuda.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
creating build/bdist.linux-x86_64/egg
copying build/lib.linux-x86_64-3.6/lltm_cuda.cpython-36m-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg
creating stub loader for lltm_cuda.cpython-36m-x86_64-linux-gnu.so
byte-compiling build/bdist.linux-x86_64/egg/lltm_cuda.py to lltm_cuda.cpython-36.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying lltm_cuda.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying lltm_cuda.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying lltm_cuda.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying lltm_cuda.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
zip_safe flag not set; analyzing archive contents...
__pycache__.lltm_cuda.cpython-36: module references __file__
creating 'dist/lltm_cuda-0.0.0-py3.6-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing lltm_cuda-0.0.0-py3.6-linux-x86_64.egg
removing '/home/xx/anaconda3/envs/mmdet/lib/python3.6/site-packages/lltm_cuda-0.0.0-py3.6-linux-x86_64.egg' (and everything under it)
creating /home/xx/anaconda3/envs/mmdet/lib/python3.6/site-packages/lltm_cuda-0.0.0-py3.6-linux-x86_64.egg
Extracting lltm_cuda-0.0.0-py3.6-linux-x86_64.egg to /home/xx/anaconda3/envs/mmdet/lib/python3.6/site-packages
lltm-cuda 0.0.0 is already the active version in easy-install.pth

Installed /home/xx/anaconda3/envs/mmdet/lib/python3.6/site-packages/lltm_cuda-0.0.0-py3.6-linux-x86_64.egg
Processing dependencies for lltm-cuda==0.0.0
Finished processing dependencies for lltm-cuda==0.0.0

Then I ran:

$ cd ..
$ python benchmark.py cuda

Then this error occurred:

Traceback (most recent call last):
  File "benchmark.py", line 43, in <module>
    new_h, new_C = rnn(X, (h, C))
  File "/home/xx/anaconda3/envs/mmdet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xx/MyGit/cuda_test/extension-cpp/cuda/lltm.py", line 45, in forward
    return LLTMFunction.apply(input, self.weights, self.bias, *state)
  File "/home/xx/MyGit/cuda_test/extension-cpp/cuda/lltm.py", line 14, in forward
    outputs = lltm_cuda.forward(input, weights, bias, old_h, old_cell)
RuntimeError: expected 3 dims but tensor has 2 (packed_accessor at /home/xx/anaconda3/envs/mmdet/lib/python3.6/site-packages/torch/lib/include/ATen/core/Tensor.h:223)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7f0568445cf5 in /home/xx/anaconda3/envs/mmdet/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: at::PackedTensorAccessor<float, 3ul, at::RestrictPtrTraits, unsigned long> at::Tensor::packed_accessor<float, 3ul, at::RestrictPtrTraits, unsigned long>() const & + 0xd3 (0x7f0552854c59 in /home/xx/anaconda3/envs/mmdet/lib/python3.6/site-packages/lltm-0.0.0-py3.6-linux-x86_64.egg/lltm_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #2: <unknown function> + 0x2b49d (0x7f055284c49d in /home/xx/anaconda3/envs/mmdet/lib/python3.6/site-packages/lltm-0.0.0-py3.6-linux-x86_64.egg/lltm_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #3: <unknown function> + 0x2b6e5 (0x7f055284c6e5 in /home/xx/anaconda3/envs/mmdet/lib/python3.6/site-packages/lltm-0.0.0-py3.6-linux-x86_64.egg/lltm_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #4: lltm_cuda_forward(at::Tensor, at::Tensor, at::Tensor, at::Tensor, at::Tensor) + 0x2f0 (0x7f055284cad5 in /home/xx/anaconda3/envs/mmdet/lib/python3.6/site-packages/lltm-0.0.0-py3.6-linux-x86_64.egg/lltm_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #5: lltm_forward(at::Tensor, at::Tensor, at::Tensor, at::Tensor, at::Tensor) + 0x1c4 (0x7f055283c454 in /home/xx/anaconda3/envs/mmdet/lib/python3.6/site-packages/lltm-0.0.0-py3.6-linux-x86_64.egg/lltm_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #6: <unknown function> + 0x23258 (0x7f0552844258 in /home/xx/anaconda3/envs/mmdet/lib/python3.6/site-packages/lltm-0.0.0-py3.6-linux-x86_64.egg/lltm_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #7: <unknown function> + 0x27c35 (0x7f0552848c35 in /home/xx/anaconda3/envs/mmdet/lib/python3.6/site-packages/lltm-0.0.0-py3.6-linux-x86_64.egg/lltm_cuda.cpython-36m-x86_64-linux-gnu.so)
<omitting python frames>
frame #14: THPFunction_apply(_object*, _object*) + 0x579 (0x7f058f9361d9 in /home/xx/anaconda3/envs/mmdet/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #38: __libc_start_main + 0xe7 (0x7f05a17abb97 in /lib/x86_64-linux-gnu/libc.so.6)

The py and cpp versions work fine.

haoyz commented 5 years ago

I have a similar problem. Did you solve it?

ClementPinard commented 5 years ago

Does it work if you update PyTorch to 1.2?

haoyz commented 5 years ago

Sorry for my incorrect reply above. There was no error when I tried lltm. In fact, the error only happens with my own cell unit, which is similar to lltm. It was caused by my initializing "bias" with 1 dimension when the actual usage expects 2 dimensions.
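
To illustrate this kind of mismatch, here is a minimal sketch of a dimension check that could run in Python before the tensors reach the compiled extension. `check_ndim` is a hypothetical helper, not part of extension-cpp, and `state_size = 128` is an assumed value. Shapes are plain tuples so the snippet runs without PyTorch; with real tensors you would pass `tuple(t.shape)`.

```python
def check_ndim(name, shape, expected_ndim):
    """Raise a readable error if `shape` does not have `expected_ndim` dimensions."""
    actual_ndim = len(shape)
    if actual_ndim != expected_ndim:
        raise ValueError(
            f"{name}: expected {expected_ndim} dims but got {actual_ndim} "
            f"(shape={tuple(shape)})"
        )
    return tuple(shape)


state_size = 128  # assumed size, for illustration only

bias_1d = (3 * state_size,)    # bias created with a single dimension
bias_2d = (1, 3 * state_size)  # bias with the two dimensions the kernel indexes

# The 2-D shape passes the check.
check_ndim("bias", bias_2d, 2)

# The 1-D shape is rejected with a message that names the offending argument.
try:
    check_ndim("bias", bias_1d, 2)
except ValueError as err:
    message = str(err)
    print(message)
```

Checking shapes on the Python side gives an error that names the argument, instead of a C++ stack trace from `packed_accessor`.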

zoezhu commented 5 years ago

> Does it work if you update pytorch to 1.2 ?

Thank you! Updating PyTorch to 1.2 works!
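
For anyone hitting the same error, a quick way to check whether the installed version is at least 1.2 is sketched below. The helpers `major_minor` and `at_least` are hypothetical names, not PyTorch APIs. They work on plain version strings, so the snippet runs without PyTorch; with PyTorch installed you could call `at_least(str(torch.__version__))`.

```python
import re


def major_minor(version):
    """Extract (major, minor) from a version string such as '1.0.1.post2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def at_least(version, minimum=(1, 2)):
    """Return True if `version` is at least `minimum`, compared as integer tuples."""
    return major_minor(version) >= minimum


# Example version strings (made up for illustration).
for candidate in ("1.0.1.post2", "1.2.0", "1.10.0"):
    print(candidate, at_least(candidate))
```

Comparing integer tuples rather than raw strings avoids ordering "1.10" before "1.2".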