Tencent / ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

Slow inference in a customized simple model #4412

Open ChristopheZhao opened 1 year ago

ChristopheZhao commented 1 year ago

error log

model

1. original model: I used ncnn to convert a simple model (almost like PointPillars) from ONNX, but inference is very slow in my test: it takes 4 s to extract one output on two A53 cores. To find the cause, I reduced it to a simple model consisting of 4 conv and 4 relu layers, and it is still slow: it takes 500 ms to extract the output of the last relu layer. By contrast, SqueezeNet completes a full inference in only about 75 ms in the same environment, so I wonder whether something is wrong with my customized model.

(screenshot attached)
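Not part of the original report, but two things usually matter when timing ncnn on an A53: how threads and cores are configured, and whether the library's optimizations are enabled. Building ncnn with the NCNN_BENCHMARK CMake option turned on prints per-layer timings, which helps locate the slow layer. Below is a minimal sketch of a benchmark-friendly setup, assuming the stock ncnn CPU API (set_cpu_powersave from cpu.h and the options on ncnn::Net::opt):

```cpp
#include "net.h"
#include "cpu.h"

// Sketch only: configure a net for benchmarking before load_param/load_model.
void setup_for_benchmark(ncnn::Net& net)
{
    ncnn::set_cpu_powersave(2);              // prefer big cores, if the SoC has any
    net.opt.num_threads = 2;                 // match the two A53 cores used in the test
    net.opt.use_winograd_convolution = true; // 3x3 stride-1 convs benefit from Winograd
    net.opt.use_sgemm_convolution = true;
    net.opt.use_fp16_packed = true;
    net.opt.use_fp16_storage = true;
    net.opt.use_fp16_arithmetic = true;      // only helps on CPUs with fp16 support
}
```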

how to reproduce

1. the code using SqueezeNet:

```cpp
#include "net.h"

ncnn::Net squeezenet;
ncnn::Mat input_data = ncnn::Mat(227, 227, 3); // dummy 227x227x3 input

const char* param_file = "../squeezenet_v1.1.param";
const char* bin_file = "../squeezenet_v1.1.bin";

squeezenet.load_param(param_file);
squeezenet.load_model(bin_file);

ncnn::Extractor ex = squeezenet.create_extractor();

ex.input("data", input_data);

ncnn::Mat out;
ex.extract("prob", out);
```

Time cost: 70 ms.

2. the code using the customized model:

```cpp
#include "net.h"
#include <chrono>
#include <iostream>
#include <string>
#include <vector>

// load_numpy_array() and since() are helpers from my test harness
std::string voxel_res_file = "../input_sample_0_reshape.npy";

std::vector<unsigned long> in_shape;
std::vector<float> in_data;
load_numpy_array(voxel_res_file, in_data, in_shape);

int w = 1000;
int h = 200;
int d = 1;
int c = 64;
std::cout << "load numpy data success!" << std::endl;
std::cout << "in_data size 0 = " << in_data.size() << std::endl;

ncnn::Mat input_data = ncnn::Mat(w, h, c, in_data.data());

ncnn::Net squeezenet;
const char* param_file = "../end2end_nn_1b.param";
const char* bin_file = "../end2end_nn_1b.bin";

squeezenet.load_param(param_file);
squeezenet.load_model(bin_file);

for (int i = 0; i < 3; i++)
{
    ncnn::Extractor ex = squeezenet.create_extractor();

    auto model_infer_start = std::chrono::steady_clock::now();
    ncnn::Mat scores;

    ex.input("pc_fatures", input_data);
    std::cout << "initia mat Elapsed(ms)=" << since(model_infer_start).count() << std::endl;

    ex.extract("262", scores); // "262" is the output blob of Conv_125
    std::cout << "extract_conv_125 Elapsed(ms)=" << since(model_infer_start).count() << std::endl;
}
```

Average time cost of each inference: 500 ms.
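The since() helper used in the snippets above is not part of ncnn and is not shown in the issue; a minimal sketch of what it is assumed to do (a std::chrono wrapper returning elapsed milliseconds):

```cpp
#include <chrono>

// Assumed definition of the since() helper used above: the time elapsed since
// `start`, returned as std::chrono::milliseconds so that .count() prints ms.
std::chrono::milliseconds since(const std::chrono::steady_clock::time_point& start)
{
    return std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - start);
}
```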

And the simple model I use is really small; the param file is as below:

```
7767517
5 5
Input       data     0 1 data -23330=4,3,1000,200,64 0=1000 1=200 2=64
Convolution Conv_119 1 1 data 253 -23330=4,3,500,100,64 0=64 1=3 3=2 4=1 5=1 6=36864 9=1
Convolution Conv_121 1 1 253 256 -23330=4,3,500,100,64 0=64 1=3 4=1 5=1 6=36864 9=1
Convolution Conv_123 1 1 256 259 -23330=4,3,500,100,64 0=64 1=3 4=1 5=1 6=36864 9=1
Convolution Conv_125 1 1 259 262 -23330=4,3,500,100,64 0=64 1=3 4=1 5=1 6=36864 9=1
```
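A rough compute estimate (not from the original thread; derived only from the param file above, with SqueezeNet v1.1's commonly quoted cost of roughly 0.35 GMACs as the reference point) suggests this "simple" model is actually much heavier than SqueezeNet, which already accounts for much of the 500 ms vs 70 ms gap. A small sketch of the arithmetic:

```cpp
// Rough MAC count derived from the param file above (sketch, not exact):
// each Convolution layer does out_w * out_h * out_c * k * k * in_c multiply-adds.
#include <cstdio>

int main()
{
    const long long out_w = 500, out_h = 100, out_c = 64;
    const long long k = 3, in_c = 64;

    long long macs_per_conv = out_w * out_h * out_c * k * k * in_c; // ~1.84e9
    long long total_macs    = 4 * macs_per_conv;                    // ~7.4e9

    printf("per conv: %.2f GMACs, total: %.2f GMACs\n",
           macs_per_conv / 1e9, total_macs / 1e9);
    // Roughly 20x the compute of SqueezeNet v1.1 (~0.35 GMACs), so a large
    // part of the 500 ms vs 70 ms difference is simply the workload size.
    return 0;
}
```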

Can you give me some advice to help find what causes the slow inference in such a simple model?

ChristopheZhao commented 1 year ago

I found that JMF1108 mentioned this in issue https://github.com/Tencent/ncnn/issues/3789. I think I have hit the same problem. Has it been fixed already?

nihui commented 3 months ago

For the various problems with ONNX model conversion, it is recommended to use the latest pnnx tool to convert your model to ncnn:

```shell
pip install pnnx
pnnx model.onnx inputshape=[1,3,224,224]
```

Detailed reference documentation: https://github.com/pnnx/pnnx and https://github.com/Tencent/ncnn/wiki/use-ncnn-with-pytorch-or-onnx#how-to-use-pnnx
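As a hedged follow-up (not part of nihui's reply): assuming pnnx writes the converted ncnn files next to the input as model.ncnn.param / model.ncnn.bin and names the blobs in0 / out0 (check the generated .param file, since the exact names depend on the pnnx version), loading the result would look roughly like:

```cpp
// Sketch: load a pnnx-converted model in ncnn.
// File names and blob names ("in0"/"out0") are assumptions; verify them
// against the generated .param file.
#include "net.h"

int main()
{
    ncnn::Net net;
    net.load_param("model.ncnn.param");
    net.load_model("model.ncnn.bin");

    ncnn::Mat in(224, 224, 3); // match the inputshape passed to pnnx

    ncnn::Extractor ex = net.create_extractor();
    ex.input("in0", in);

    ncnn::Mat out;
    ex.extract("out0", out);
    return 0;
}
```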