pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

caffe2::CudnnConvOp::RunOnDevice() fails on Squeezenet #16299

Open drnikolaev opened 5 years ago

drnikolaev commented 5 years ago

🐛 Bug

caffe2::CudnnConvOp::RunOnDevice() fails on Squeezenet

To Reproduce

Steps to reproduce the behavior:

  1. Unzip and copy test_trt.zip to caffe2/python/trt/

  2. Run the `test_trt.TensorRTTransformTest.test_squeezenet_core` test:

Error
Traceback (most recent call last):
  File "/home/snikolaev/anaconda3/lib/python3.6/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/home/snikolaev/anaconda3/lib/python3.6/unittest/case.py", line 605, in run
    testMethod()
  File "/home/snikolaev/pytorch2/caffe2/python/trt/test_trt.py", line 668, in test_squeezenet_core
    workspace.RunNet(pred_net.name)
  File "/home/snikolaev/anaconda3/lib/python3.6/site-packages/caffe2/python/workspace.py", line 236, in RunNet
    StringifyNetName(name), num_iter, allow_fail,
  File "/home/snikolaev/anaconda3/lib/python3.6/site-packages/caffe2/python/workspace.py", line 197, in CallWithExceptionIntercept
    return func(*args, **kwargs)
RuntimeError: [enforce fail at conv_op_cudnn.cc:522] X.dim() >= 3 && X.dim() <= 5. 
Error from operator: 
input: "data" input: "conv1_w" input: "conv1_b" output: "conv1" type: "Conv" arg { name: "stride" i: 2 } arg { name: "pad" i: 0 } arg { name: "kernel" i: 3 } device_option { device_type: 1 device_id: 0 } engine: "CUDNN"
frame #0: <unknown function> + 0x3e075 (0x7f3ee48fc075 in /home/snikolaev/anaconda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libc10.so)
frame #1: std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()>::operator()() const + 0x4c (0x7f3ee48fc504 in /home/snikolaev/anaconda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libc10.so)
frame #2: c10::ThrowEnforceNotMet(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, void const*) + 0x57 (0x7f3ee48fbbf4 in /home/snikolaev/anaconda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libc10.so)
frame #3: bool caffe2::CudnnConvOp::DoRunWithType<float, float, float, float>() + 0xf9 (0x7f3ee8b43d25 in /home/snikolaev/anaconda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2_gpu.so)
frame #4: caffe2::CudnnConvOp::RunOnDevice() + 0x4e (0x7f3ee8b3c6c2 in /home/snikolaev/anaconda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2_gpu.so)
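
For readers hitting the same enforce: the message `X.dim() >= 3 && X.dim() <= 5` says the convolution input must have between 3 and 5 dimensions. The sketch below only mirrors that rank check in plain Python so an input shape can be inspected before the net runs. It assumes that `X` is the operator's first input (the `data` blob in the proto above); the helper name `rank_is_supported` and the example shapes are hypothetical, and this does not identify the root cause of this particular failure.

```python
def rank_is_supported(shape, min_rank=3, max_rank=5):
    """Mirror the enforce from the traceback: min_rank <= X.dim() <= max_rank.

    `shape` is any sequence of dimension sizes, for example the shape of the
    array returned by workspace.FetchBlob("data") in Caffe2.
    """
    return min_rank <= len(shape) <= max_rank


# Hypothetical shapes, chosen only for illustration.
nchw_batch = (1, 3, 227, 227)   # 4 dimensions: passes the rank check
flattened = (3 * 227 * 227,)    # 1 dimension: would trip the enforce
empty_blob = ()                 # 0 dimensions, e.g. a blob that was never fed

for name, shape in [("nchw_batch", nchw_batch),
                    ("flattened", flattened),
                    ("empty_blob", empty_blob)]:
    print(name, len(shape), rank_is_supported(shape))
```

If the shape of `data` fails this check right before `workspace.RunNet`, the input blob was probably fed with the wrong layout or not fed at all, which would be one place to start looking.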

lssily commented 5 years ago

Hello @drnikolaev! When I implement a Conv operation, I hit a similar error. How did you solve it?