facebookresearch / Detectron

FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.
Apache License 2.0
26.22k stars 5.45k forks

RuntimeError: [enforce fail at operator.cc:75] blob != nullptr. op Conv: Encountered a non-existing input blob: gpu_0/old_res3_7_sum #941

Open carryyu opened 4 years ago

carryyu commented 4 years ago

The error (CBNet version based on Detectron):

[W workspace.cc:170] Blob gpu_0/old_res3_7_sum not in the workspace.
WARNING workspace.py: 222: Original python traceback for operator 383 in network generalized_rcnn in exception above (most recent call last):
WARNING workspace.py: 227: File "/home/lzy/diverse/CBNet/tools/train_net.py", line 133, in <module>
WARNING workspace.py: 227: File "/home/lzy/diverse/CBNet/tools/train_net.py", line 115, in main
WARNING workspace.py: 227: File "/home/lzy/diverse/CBNet/detectron/utils/train.py", line 53, in train_model
WARNING workspace.py: 227: File "/home/lzy/diverse/CBNet/detectron/utils/train.py", line 145, in create_model
WARNING workspace.py: 227: File "/home/lzy/diverse/CBNet/detectron/modeling/model_builder.py", line 127, in create
WARNING workspace.py: 227: File "/home/lzy/diverse/CBNet/detectron/modeling/model_builder.py", line 91, in generalized_rcnn
WARNING workspace.py: 227: File "/home/lzy/diverse/CBNet/detectron/modeling/model_builder.py", line 259, in build_generic_detection_model
WARNING workspace.py: 227: File "/home/lzy/diverse/CBNet/detectron/modeling/optimizer.py", line 40, in build_data_parallel_model
WARNING workspace.py: 227: File "/home/lzy/diverse/CBNet/detectron/modeling/optimizer.py", line 63, in _build_forward_graph
WARNING workspace.py: 227: File "/home/lzy/diverse/CBNet/detectron/modeling/model_builder.py", line 189, in _single_gpu_build_func
WARNING workspace.py: 227: File "/home/lzy/diverse/CBNet/detectron/modeling/FPN.py", line 64, in add_fpn_ResNet101_conv5_body
WARNING workspace.py: 227: File "/home/lzy/diverse/CBNet/detectron/modeling/FPN.py", line 112, in add_fpn_onto_conv_body
WARNING workspace.py: 227: File "/home/lzy/diverse/CBNet/detectron/modeling/ResNet.py", line 48, in add_ResNet101_conv5_body
WARNING workspace.py: 227: File "/home/lzy/diverse/CBNet/detectron/modeling/ResNet.py", line 145, in add_ResNet_convX_body
Traceback (most recent call last):
  File "/home/lzy/diverse/CBNet/tools/train_net.py", line 133, in <module>
    main()
  File "/home/lzy/diverse/CBNet/tools/train_net.py", line 115, in main
    checkpoints = detectron.utils.train.train_model()
  File "/home/lzy/diverse/CBNet/detectron/utils/train.py", line 58, in train_model
    setup_model_for_training(model, weights_file, output_dir)
  File "/home/lzy/diverse/CBNet/detectron/utils/train.py", line 179, in setup_model_for_training
    workspace.CreateNet(model.net)
  File "/home/lzy/pytorch/build/caffe2/python/workspace.py", line 181, in CreateNet
    StringifyProto(net), overwrite,
  File "/home/lzy/pytorch/build/caffe2/python/workspace.py", line 215, in CallWithExceptionIntercept
    return func(*args, **kwargs)
RuntimeError: [enforce fail at operator.cc:75] blob != nullptr. op Conv: Encountered a non-existing input blob: gpu_0/old_res3_7_sum
frame #0: c10::ThrowEnforceNotMet(char const, int, char const, std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, void const) + 0x76 (0x7f916475ed36 in /home/lzy/pytorch/build/lib/libc10.so)
frame #1: caffe2::OperatorBase::OperatorBase(caffe2::OperatorDef const&, caffe2::Workspace) + 0x3ff (0x7f9144b7bd2f in /home/lzy/pytorch/build/lib/libtorch.so)
frame #2: + 0x3f68805 (0x7f914635b805 in /home/lzy/pytorch/build/lib/libtorch.so)
frame #3: + 0x3f868eb (0x7f91463798eb in /home/lzy/pytorch/build/lib/libtorch.so)
frame #4: + 0x3f8841e (0x7f914637b41e in /home/lzy/pytorch/build/lib/libtorch.so)
frame #5: std::_Function_handler<std::unique_ptr<caffe2::OperatorBase, std::default_deletecaffe2::OperatorBase > (caffe2::OperatorDef const&, caffe2::Workspace), std::unique_ptr<caffe2::OperatorBase, std::default_deletecaffe2::OperatorBase > ()(caffe2::OperatorDef const&, caffe2::Workspace)>::_M_invoke(std::_Any_data const&, caffe2::OperatorDef const&, caffe2::Workspace&&) + 0x23 (0x7f9164bf96a3 in /home/lzy/pytorch/build/caffe2/python/caffe2_pybind11_state_gpu.so)
frame #6: + 0x2786301 (0x7f9144b79301 in /home/lzy/pytorch/build/lib/libtorch.so)
frame #7: caffe2::CreateOperator(caffe2::OperatorDef const&, caffe2::Workspace, int) + 0x32a (0x7f9144b7a60a in /home/lzy/pytorch/build/lib/libtorch.so)
frame #8: caffe2::dag_utils::prepareOperatorNodes(std::shared_ptr const&, caffe2::Workspace) + 0x17f3 (0x7f9144b74b93 in /home/lzy/pytorch/build/lib/libtorch.so)
frame #9: caffe2::AsyncNetBase::AsyncNetBase(std::shared_ptr const&, caffe2::Workspace) + 0x246 (0x7f9144b8c026 in /home/lzy/pytorch/build/lib/libtorch.so)
frame #10: caffe2::AsyncSchedulingNet::AsyncSchedulingNet(std::shared_ptr const&, caffe2::Workspace) + 0x9 (0x7f9144bb6989 in /home/lzy/pytorch/build/lib/libtorch.so)
frame #11: + 0x27c5e2e (0x7f9144bb8e2e in /home/lzy/pytorch/build/lib/libtorch.so)
frame #12: std::_Function_handler<std::unique_ptr<caffe2::NetBase, std::default_deletecaffe2::NetBase > (std::shared_ptr const&, caffe2::Workspace), std::unique_ptr<caffe2::NetBase, std::default_deletecaffe2::NetBase > ()(std::shared_ptr const&, caffe2::Workspace)>::_M_invoke(std::_Any_data const&, std::shared_ptr const&, caffe2::Workspace&&) + 0x23 (0x7f9144bb8ce3 in /home/lzy/pytorch/build/lib/libtorch.so)
frame #13: caffe2::CreateNet(std::shared_ptr const&, caffe2::Workspace) + 0x847 (0x7f9144bc3117 in /home/lzy/pytorch/build/lib/libtorch.so)
frame #14: caffe2::Workspace::CreateNet(std::shared_ptr const&, bool) + 0x13c (0x7f9144bdf24c in /home/lzy/pytorch/build/lib/libtorch.so)
frame #15: caffe2::Workspace::CreateNet(caffe2::NetDef const&, bool) + 0x9f (0x7f9144be094f in /home/lzy/pytorch/build/lib/libtorch.so)
frame #16: + 0x51f70 (0x7f9164beef70 in /home/lzy/pytorch/build/caffe2/python/caffe2_pybind11_state_gpu.so)
frame #17: + 0x521de (0x7f9164bef1de in /home/lzy/pytorch/build/caffe2/python/caffe2_pybind11_state_gpu.so)
frame #18: + 0x99160 (0x7f9164c36160 in /home/lzy/pytorch/build/caffe2/python/caffe2_pybind11_state_gpu.so)

frame #36: __libc_start_main + 0xf0 (0x7f9168059830 in /lib/x86_64-linux-gnu/libc.so.6)
frame #37: + 0x107f (0x55e423b0507f in /home/lzy/anaconda2/envs/lzy/bin/python)

carryyu commented 4 years ago

How can I change the node name?

vandesa003 commented 4 years ago

I hit the same error. Have you solved it?

mamunir commented 4 years ago

Hi. Any clue on how to solve this issue? I have the same problem.

hw446 commented 4 years ago

The backbone should be x152. If you want to use a different backbone, see https://github.com/PKUbahuangliuhe/CBNet/issues/3
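This fits the traceback, which goes through add_ResNet101_conv5_body: a 101-layer ResNet has only four blocks in its res3 stage, so the last sum blob there would be res3_3_sum, while res3_7_sum only exists with a 152-layer backbone. To see which ops read blobs that nothing produces before calling workspace.CreateNet, a check along these lines may help. It is only a sketch: find_dangling_inputs and the Op tuple are hypothetical helpers, not part of Detectron or Caffe2, and every blob name in the toy graph except gpu_0/old_res3_7_sum is made up.

```python
from collections import namedtuple

# Hypothetical stand-in for a Caffe2 OperatorDef; only the three
# fields the check needs (type, input, output) are modelled.
Op = namedtuple("Op", ["type", "input", "output"])


def find_dangling_inputs(ops, external_inputs=()):
    """Return (op_index, op_type, blob) for each input blob that is
    neither an external input nor the output of an earlier op."""
    available = set(external_inputs)
    missing = []
    for idx, op in enumerate(ops):
        for blob in op.input:
            if blob not in available:
                missing.append((idx, op.type, blob))
        # Outputs become visible to every op that comes later.
        available.update(op.output)
    return missing


# Toy graph (made-up names): the producer emits old_res3_3_sum,
# but the last Conv was wired to read old_res3_7_sum.
ops = [
    Op("Conv", ["gpu_0/data", "gpu_0/conv1_w"], ["gpu_0/conv1"]),
    Op("Sum", ["gpu_0/conv1", "gpu_0/data"], ["gpu_0/old_res3_3_sum"]),
    Op("Conv", ["gpu_0/old_res3_7_sum", "gpu_0/cb_w"], ["gpu_0/cb_out"]),
]
external = ["gpu_0/data", "gpu_0/conv1_w", "gpu_0/cb_w"]

for idx, op_type, blob in find_dangling_inputs(ops, external):
    print("op %d (%s) reads missing blob %s" % (idx, op_type, blob))
```

With a real model, the same function could presumably be called as find_dangling_inputs(model.net.Proto().op, model.net.Proto().external_input), assuming the weights and data blobs are listed as external inputs of the net. Any blob it reports points at the layer name that has to match the chosen backbone.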