facebookarchive / models

A repository for storing pre-trained Caffe2 models.
Apache License 2.0

cannot run mobilenet_v2_quantized on pytorch/caffe2 #52

Open baynaa7 opened 6 years ago

baynaa7 commented 6 years ago

I am trying to run mobilenet_v2_quantized on the pytorch/caffe2 repo, which supports int8.

After compiling pytorch on a TX2 and running the model, I get the following error:


WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode.
WARNING:caffe2.python.workspace:Original python traceback for operator `-2110861432` in network `mobilenet_v2_quant` in exception above (most recent call last):
Traceback (most recent call last):
  File "test_caffe2.py", line 470, in <module>
    net_def = createNet(pred_net)
  File "test_caffe2.py", line 406, in createNet
    workspace.CreateNet(net_def, overwrite=True)
  File "/home/gg/pytorch/build/caffe2/python/workspace.py", line 154, in CreateNet
    StringifyProto(net), overwrite,
  File "/home/gg/pytorch/build/caffe2/python/workspace.py", line 180, in CallWithExceptionIntercept
    return func(*args, **kwargs)
RuntimeError: [enforce fail at operator.cc:46] blob != nullptr. op NCHW2NHWC: Encountered a non-existing input blob: data
frame #0: c10::ThrowEnforceNotMet(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, void const*) + 0x7c (0x7f5ac89ed4 in /home/gg/pytorch/build/lib/libc10.so)
frame #1: caffe2::OperatorBase::OperatorBase(caffe2::OperatorDef const&, caffe2::Workspace*) + 0x560 (0x7f5bc5fa18 in /home/gg/pytorch/build/lib/libcaffe2.so)
frame #2: <unknown function> + 0x13e5efc (0x7f5c088efc in /home/gg/pytorch/build/lib/libcaffe2.so)
frame #3: std::_Function_handler<std::unique_ptr<caffe2::OperatorBase, std::default_delete<caffe2::OperatorBase> > (caffe2::OperatorDef const&, caffe2::Workspace*), std::unique_ptr<caffe2::OperatorBase, std::default_delete<caffe2::OperatorBase> > (*)(caffe2::OperatorDef const&, caffe2::Workspace*)>::_M_invoke(std::_Any_data const&, caffe2::OperatorDef const&, caffe2::Workspace*&&) + 0x34 (0x7f5cb5dd9c in /home/gg/pytorch/build/caffe2/python/caffe2_pybind11_state.cpython-35m-aarch64-linux-gnu.so)
frame #4: <unknown function> + 0xfbacfc (0x7f5bc5dcfc in /home/gg/pytorch/build/lib/libcaffe2.so)
frame #5: <unknown function> + 0xfbcd0c (0x7f5bc5fd0c in /home/gg/pytorch/build/lib/libcaffe2.so)
frame #6: caffe2::CreateOperator(caffe2::OperatorDef const&, caffe2::Workspace*, int) + 0x430 (0x7f5bc60898 in /home/gg/pytorch/build/lib/libcaffe2.so)
frame #7: caffe2::SimpleNet::SimpleNet(std::shared_ptr<caffe2::NetDef const> const&, caffe2::Workspace*) + 0x3dc (0x7f5bcc016c in /home/gg/pytorch/build/lib/libcaffe2.so)
frame #8: <unknown function> + 0x101eb3c (0x7f5bcc1b3c in /home/gg/pytorch/build/lib/libcaffe2.so)
frame #9: <unknown function> + 0xfa5904 (0x7f5bc48904 in /home/gg/pytorch/build/lib/libcaffe2.so)
frame #10: caffe2::CreateNet(std::shared_ptr<caffe2::NetDef const> const&, caffe2::Workspace*) + 0x90c (0x7f5bc98f94 in /home/gg/pytorch/build/lib/libcaffe2.so)
frame #11: caffe2::Workspace::CreateNet(std::shared_ptr<caffe2::NetDef const> const&, bool) + 0x1e4 (0x7f5bcab824 in /home/gg/pytorch/build/lib/libcaffe2.so)
frame #12: caffe2::Workspace::CreateNet(caffe2::NetDef const&, bool) + 0xa4 (0x7f5bcaca7c in /home/gg/pytorch/build/lib/libcaffe2.so)
frame #13: <unknown function> + 0x51060 (0x7f5cb55060 in /home/gg/pytorch/build/caffe2/python/caffe2_pybind11_state.cpython-35m-aarch64-linux-gnu.so)
frame #14: <unknown function> + 0x512d0 (0x7f5cb552d0 in /home/gg/pytorch/build/caffe2/python/caffe2_pybind11_state.cpython-35m-aarch64-linux-gnu.so)
frame #15: <unknown function> + 0x8edfc (0x7f5cb92dfc in /home/gg/pytorch/build/caffe2/python/caffe2_pybind11_state.cpython-35m-aarch64-linux-gnu.so)
<omitting python frames>

What is the problem?

TerryTsao commented 5 years ago

Same here. Can't run it. I've posted it here.

https://discuss.pytorch.org/t/caffe2-mobilenetv2-quantized-using-caffe2-blobistensortype-blob-cpu-blob-is-not-a-cpu-tensor-325/29065

My error msg is different, though.

What's interesting is that, in a desperate attempt, I downloaded the quantized resnet-50 and ran the same script. I got yet another error msg, a bit similar to what @pcub showed above.

I've tried on both a TX2 and a Raspberry Pi, btw. Same error msg. Hoping someone can give some pointers.

newstzpz commented 5 years ago

I think the reason is that workspace.CreateNet requires every blob the network reads to exist at the time the network is created, but the input blob "data" did not exist yet. One solution is to call workspace.CreateBlob("data") before calling createNet.
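A minimal sketch of that suggestion, in case it helps. The helper name create_net_with_inputs is made up for illustration. The workspace module is passed in as an argument (an assumption of this sketch, not something from the thread) so the function itself does not import Caffe2. It relies only on workspace.CreateBlob and workspace.CreateNet, the two calls named in this thread. It also assumes that "data" is the only missing blob and that the weights have already been loaded into the workspace.

```python
def create_net_with_inputs(workspace, net_def, input_names=("data",)):
    """Create empty placeholder blobs for the inputs, then create the net.

    `workspace` is expected to be the `caffe2.python.workspace` module.
    `net_def` is the predict NetDef, as in the traceback above.
    """
    for name in input_names:
        # Register an empty blob under this name so that operator
        # construction no longer hits "non-existing input blob".
        workspace.CreateBlob(name)
    # Same call that failed in the traceback, now with inputs present.
    return workspace.CreateNet(net_def, overwrite=True)


# Hypothetical usage on a machine with a Caffe2 build:
#   from caffe2.python import workspace
#   create_net_with_inputs(workspace, pred_net)
```

The placeholder blob is empty, so real input data still has to be fed into "data" before the net is run.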

baynaa7 commented 5 years ago

The idea that @newstzpz mentioned solved the problem.

TerryTsao commented 5 years ago

@pcub So I'm guessing I shouldn't have followed the tutorial, since your way seems different from mine and works now. Could you share a code snippet showing how it's done? I have zero Caffe2 experience, so I'd really appreciate that.