tusen-ai / simpledet

A Simple and Versatile Framework for Object Detection and Instance Recognition
Apache License 2.0

memonger_v2 bug, Check failed: req[cudnnbatchnorm::kOut] == kWriteTo (0 vs. 1) #262

Closed nttstar closed 4 years ago

nttstar commented 4 years ago

MXNet version: 1.5.0 (installed with pip install mxnet-cu80), OS: CentOS 7.

I am trying to use memonger_v2 in insightface (train_parall.py under insightface/recognition) as follows:

from memonger_v2 import search_plan_to_layer

def get_symbol_embedding():
    embedding = eval(config.net_name).get_symbol()
    if config.memonger:
        worker_data_shape = {'data': (config.per_batch_size, 3, 112, 112)}
        type_dict = {k: np.float32 for k in worker_data_shape}
        last_block = ""
        embedding = search_plan_to_layer(embedding, last_block, 1000, type_dict=type_dict, **worker_data_shape)
    ......

Here, embedding is a 512-dimensional feature reduction layer applied to the stride-16 feature map from ResNet. All activation layers used in the ResNet are PReLU.

The memonger output is:

Search threshold=0 MB, cost=3497 MB
Search threshold=774 MB, cost=3810 MB
Search threshold=334 MB, cost=2854 MB
Find best plan with threshold=334, cost=2854 MB
Old feature map cost=7120 MB
New feature map cost=2854 MB
Search threshold=1000 MB, cost=2854 MB
Search threshold=300 MB, cost=2854 MB
Search threshold=600 MB, cost=2854 MB
Search threshold=900 MB, cost=2854 MB
Search threshold=1200 MB, cost=2854 MB
Search threshold=1500 MB, cost=2854 MB
Search threshold=1800 MB, cost=2854 MB
Search threshold=2100 MB, cost=2854 MB
Search threshold=2400 MB, cost=2854 MB
Search threshold=2700 MB, cost=2854 MB
Search threshold=707 MB, cost=2854 MB
Search threshold=1414 MB, cost=2854 MB
Search threshold=2121 MB, cost=2854 MB
Search threshold=2828 MB, cost=2854 MB
Search threshold=3535 MB, cost=2854 MB
Search threshold=4242 MB, cost=2854 MB
Find best plan with threshold=1000, cost=2854 MB
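
For reference, the saving reported in the log above can be worked out from the "Old feature map cost" and "New feature map cost" lines. The snippet below is only a small sketch of that arithmetic; the helper name parse_cost is made up for illustration and is not part of memonger_v2.

```python
import re

# Two lines copied verbatim from the memonger output above.
log = """Old feature map cost=7120 MB
New feature map cost=2854 MB"""


def parse_cost(log_text, label):
    """Return the feature map cost in MB for 'Old' or 'New' (hypothetical helper)."""
    match = re.search(rf"{label} feature map cost=(\d+) MB", log_text)
    if match is None:
        raise ValueError(f"no '{label} feature map cost' line found")
    return int(match.group(1))


old = parse_cost(log, "Old")
new = parse_cost(log, "New")
saving = 1 - new / old  # fraction of feature map memory saved by the plan

print(f"{old} MB -> {new} MB ({saving:.1%} reduction)")
# prints: 7120 MB -> 2854 MB (59.9% reduction)
```

So the search itself reports a plan that cuts feature map memory by roughly 60%; the failure only shows up afterwards, when the rewritten symbol is executed.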

And the error message:

mxnet.base.MXNetError: [11:48:54] src/operator/nn/./cudnn/cudnn_batch_norm-inl.h:85: Check failed: req[cudnnbatchnorm::kOut] == kWriteTo (0 vs. 1) :
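
To read the "(0 vs. 1)" part of the message: the check compares the write request for the BatchNorm output against kWriteTo and prints the actual value first, then the expected one. The sketch below decodes the two numbers, assuming MXNet's OpReqType enum order (kNullOp=0, kWriteTo=1, kWriteInplace=2, kAddTo=3); the Python enum and the decode_check helper are illustrative mirrors written for this post, not MXNet APIs.

```python
import re
from enum import IntEnum


class OpReqType(IntEnum):
    """Illustrative mirror of MXNet's C++ OpReqType enum (order is an assumption)."""
    kNullOp = 0        # no output is requested
    kWriteTo = 1       # write the result into the output buffer
    kWriteInplace = 2  # write the result in place over an input
    kAddTo = 3         # accumulate the result into the output buffer


def decode_check(message):
    """Extract '(actual vs. expected)' from a failed-check message (hypothetical helper)."""
    match = re.search(r"\((\d+) vs\. (\d+)\)", message)
    if match is None:
        raise ValueError("no '(a vs. b)' pair found in message")
    actual, expected = (OpReqType(int(g)) for g in match.groups())
    return actual, expected


msg = "Check failed: req[cudnnbatchnorm::kOut] == kWriteTo (0 vs. 1)"
actual, expected = decode_check(msg)
print(f"actual={actual.name}, expected={expected.name}")
```

Under that assumption, the cuDNN BatchNorm forward was called with its output request set to kNullOp instead of kWriteTo, i.e. the executor did not ask for that output to be written at all.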