tusen-ai / simpledet

A Simple and Versatile Framework for Object Detection and Instance Recognition
Apache License 2.0

localbn may conflict with memonger #333

Closed zehuichen123 closed 4 years ago

zehuichen123 commented 4 years ago

When I use localbn together with memonger, training fails with the following error:

Traceback (most recent call last):
  File "detection_train.py", line 311, in <module>
    train_net(parse_args())
  File "detection_train.py", line 293, in train_net
    profile=profile
  File "/mnt/truenas/scratch/czh/lancher/simpledet_neck/core/detection_module.py", line 1014, in fit
    self.update_metric(eval_metric, data_batch.label)
  File "/mnt/truenas/scratch/czh/lancher/simpledet_neck/core/detection_module.py", line 793, in update_metric
    self._exec_group.update_metric(eval_metric, labels, pre_sliced)
  File "/mnt/truenas/scratch/hzh/software/simpledet-mxnet/mxnet_xyxy/python/mxnet/module/executor_group.py", line 640, in update_metric
    eval_metric.update_dict(labels_, preds)
  File "/mnt/truenas/scratch/hzh/software/simpledet-mxnet/mxnet_xyxy/python/mxnet/metric.py", line 350, in update_dict
    metric.update_dict(labels, preds)
  File "/mnt/truenas/scratch/hzh/software/simpledet-mxnet/mxnet_xyxy/python/mxnet/metric.py", line 133, in update_dict
    self.update(label, pred)
  File "/mnt/truenas/scratch/czh/lancher/simpledet_neck/core/detection_metric.py", line 58, in update
    pred_label = mx.ndarray.argmax_channel(pred).astype('int32').asnumpy().reshape(-1)
  File "/mnt/truenas/scratch/hzh/software/simpledet-mxnet/mxnet_xyxy/python/mxnet/ndarray/ndarray.py", line 2504, in asnumpy
    ctypes.c_size_t(data.size)))
  File "/mnt/truenas/scratch/hzh/software/simpledet-mxnet/mxnet_xyxy/python/mxnet/base.py", line 254, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [11:18:06] src/operator/nn/./cudnn/cudnn_batch_norm-inl.h:87: Check failed: req[cudnnbatchnorm::kOut] == kWriteTo (0 vs. 1) : 
Stack trace:
  [bt] (0) /mnt/truenas/scratch/hzh/software/simpledet-mxnet/mxnet_xyxy/python/mxnet/../../lib/libmxnet.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x32) [0x7fc16f7c7592]
  [bt] (1) /mnt/truenas/scratch/hzh/software/simpledet-mxnet/mxnet_xyxy/python/mxnet/../../lib/libmxnet.so(mxnet::op::CuDNNBatchNormOp<float>::Forward(mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)+0x398) [0x7fc1722e3ea8]
  [bt] (2) /mnt/truenas/scratch/hzh/software/simpledet-mxnet/mxnet_xyxy/python/mxnet/../../lib/libmxnet.so(void mxnet::op::BatchNormCompute<mshadow::gpu>(nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)+0xca8) [0x7fc1722d7b88]
  [bt] (3) /mnt/truenas/scratch/hzh/software/simpledet-mxnet/mxnet_xyxy/python/mxnet/../../lib/libmxnet.so(mxnet::exec::FComputeExecutor::Run(mxnet::RunContext, bool)+0x76) [0x7fc17204fa76]
  [bt] (4) /mnt/truenas/scratch/hzh/software/simpledet-mxnet/mxnet_xyxy/python/mxnet/../../lib/libmxnet.so(+0x3c017c0) [0x7fc1720077c0]
  [bt] (5) /mnt/truenas/scratch/hzh/software/simpledet-mxnet/mxnet_xyxy/python/mxnet/../../lib/libmxnet.so(mxnet::engine::ThreadedEngine::ExecuteOprBlock(mxnet::RunContext, mxnet::engine::OprBlock*)+0x995) [0x7fc171f4fbf5]
  [bt] (6) /mnt/truenas/scratch/hzh/software/simpledet-mxnet/mxnet_xyxy/python/mxnet/../../lib/libmxnet.so(void mxnet::engine::ThreadedEnginePerDevice::GPUWorker<(dmlc::ConcurrentQueueType)0>(mxnet::Context, bool, mxnet::engine::ThreadedEnginePerDevice::ThreadWorkerBlock<(dmlc::ConcurrentQueueType)0>*, std::shared_ptr<dmlc::ManualEvent> const&)+0x11d) [0x7fc171f6862d]
  [bt] (7) /mnt/truenas/scratch/hzh/software/simpledet-mxnet/mxnet_xyxy/python/mxnet/../../lib/libmxnet.so(std::_Function_handler<void (std::shared_ptr<dmlc::ManualEvent>), mxnet::engine::ThreadedEnginePerDevice::PushToExecute(mxnet::engine::OprBlock*, bool)::{lambda()#4}::operator()() const::{lambda(std::shared_ptr<dmlc::ManualEvent>)#1}>::_M_invoke(std::_Any_data const&, std::shared_ptr<dmlc::ManualEvent>&&)+0x4e) [0x7fc171f688de]
  [bt] (8) /mnt/truenas/scratch/hzh/software/simpledet-mxnet/mxnet_xyxy/python/mxnet/../../lib/libmxnet.so(std::thread::_Impl<std::_Bind_simple<std::function<void (std::shared_ptr<dmlc::ManualEvent>)> (std::shared_ptr<dmlc::ManualEvent>)> >::_M_run()+0x4a) [0x7fc171f4e07a]