apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

Inference speed using the C++ API is slower with a higher MXNet version than with a lower one #14512

Open PapaMadeleine2022 opened 5 years ago

PapaMadeleine2022 commented 5 years ago

Hello, I have a problem: I run the same C++ API code to infer a batch of images with libmxnet.so built from MXNet v0.8 and, for comparison, from v1.0 (as well as v1.3 and v1.4). The inference speed with the higher versions is slower than with v0.8. What causes this problem, and how can it be fixed? Can anyone give some advice?

Environment: CUDA 8 / cuDNN 5.1.10 / NVIDIA driver 384.81
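
For reference, here is a minimal sketch of the kind of timing loop described above, using the C Predict API (`c_predict_api.h`). The model file names, the input name `data`, and the input shape are placeholder assumptions rather than details of the actual model:

```cpp
#include <mxnet/c_predict_api.h>
#include <chrono>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

// Read a whole file into memory (symbol JSON / parameter blob).
static std::string ReadFile(const char* path) {
  std::ifstream in(path, std::ios::binary);
  return std::string(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
}

int main() {
  std::string json  = ReadFile("model-symbol.json");   // placeholder file name
  std::string param = ReadFile("model-0000.params");   // placeholder file name

  const char* input_keys[] = {"data"};                  // assumed input name
  const mx_uint shape_indptr[] = {0, 4};
  const mx_uint shape_data[]   = {1, 3, 32, 280};       // assumed NCHW input shape

  // Time predictor creation (the MXPredCreate step).
  PredictorHandle pred = nullptr;
  auto t0 = std::chrono::steady_clock::now();
  MXPredCreate(json.c_str(), param.data(), static_cast<int>(param.size()),
               2 /* dev_type: 2 = GPU */, 0 /* dev_id */,
               1, input_keys, shape_indptr, shape_data, &pred);
  auto t1 = std::chrono::steady_clock::now();
  std::cout << "MXPredCreate: "
            << std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count()
            << " ms\n";

  // Time a few forward passes on a dummy batch.
  std::vector<mx_float> input(1 * 3 * 32 * 280, 0.5f);
  for (int i = 0; i < 4; ++i) {
    auto s = std::chrono::steady_clock::now();
    MXPredSetInput(pred, "data", input.data(), static_cast<mx_uint>(input.size()));
    MXPredForward(pred);

    // Copying the output back forces the asynchronous run to complete.
    mx_uint* out_shape = nullptr;
    mx_uint out_dim = 0;
    MXPredGetOutputShape(pred, 0, &out_shape, &out_dim);
    mx_uint out_size = 1;
    for (mx_uint d = 0; d < out_dim; ++d) out_size *= out_shape[d];
    std::vector<mx_float> out(out_size);
    MXPredGetOutput(pred, 0, out.data(), out_size);

    auto e = std::chrono::steady_clock::now();
    std::cout << "forward " << i << ": "
              << std::chrono::duration_cast<std::chrono::milliseconds>(e - s).count()
              << " ms\n";
  }
  MXPredFree(pred);
  return 0;
}
```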

mxnet-label-bot commented 5 years ago

Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended labels: Performance

pengzhao-intel commented 5 years ago

Could you try CPU inference, which has been significantly improved since 0.7?

https://mxnet.incubator.apache.org/versions/master/tutorials/mkldnn/MKLDNN_README.html
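
For what it's worth, switching the predict-API sketch from the first comment to CPU only requires changing the device type passed to `MXPredCreate` (1 = CPU, 2 = GPU); the MKL-DNN speedups described in the link also need a libmxnet.so built with MKL-DNN enabled, as the README explains. A sketch of just that changed call, reusing the variables from the earlier example:

```cpp
// Same call as in the GPU sketch above, but targeting CPU.
// dev_type: 1 = CPU, 2 = GPU; MKL-DNN kernels are only used if libmxnet.so
// was built with MKL-DNN support (see the linked README).
MXPredCreate(json.c_str(), param.data(), static_cast<int>(param.size()),
             1 /* dev_type: CPU */, 0 /* dev_id */,
             1, input_keys, shape_indptr, shape_data, &pred);
```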

wkcn commented 5 years ago

The issue may be similar to https://github.com/apache/incubator-mxnet/issues/13928

However, I do not have a machine with an NVIDIA GPU to test it :(

PapaMadeleine2022 commented 5 years ago

@wkcn I don't know which op in my model is slower in the newer MXNet. There are no dilated convolutional layers in my model, which is an OCR recognition model with a simple CNN and RNN. Can you give some detailed testing advice?

wkcn commented 5 years ago

@IvyGongoogle Could you please use the profiler to measure the execution time? http://mxnet.incubator.apache.org/versions/master/tutorials/python/profiler.html?highlight=profiler

How much time does your model take on the different MXNet versions? Since MXNet supports large tensors now, performance may drop a little.
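
That tutorial is for Python; as a rough sketch, the same profiler can also be driven from C++ through the C API. The entry points and parameter names below (`MXSetProfilerConfig`, `MXSetProfilerState`, `MXDumpProfile`, `profile_all`, `filename`) are assumptions based on `mxnet/c_api.h` and the Python tutorial, so double-check them against your MXNet version:

```cpp
// Rough sketch: enable the MXNet profiler around the forward passes from C++.
// Assumes the profiler entry points declared in mxnet/c_api.h for this
// version; the parameter names mirror the Python tutorial linked above.
#include <mxnet/c_api.h>

void ProfileForwards() {
  const char* keys[] = {"profile_all", "filename"};
  const char* vals[] = {"1", "profile_output.json"};
  MXSetProfilerConfig(2, keys, vals);   // record all events to a JSON trace

  MXSetProfilerState(1);                // 1 = start recording
  // ... run MXPredSetInput / MXPredForward / MXPredGetOutput here ...
  MXSetProfilerState(0);                // 0 = stop recording

  MXDumpProfile(1);                     // 1 = profiling finished; writes the trace file
}
```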

PapaMadeleine2022 commented 5 years ago

@wkcn Sorry, I cannot find any C++ code showing how to use the profiler.

wkcn commented 5 years ago

@IvyGongoogle Could you please provide the forward time for the different versions of MXNet? Is the difference obvious?

PapaMadeleine2022 commented 5 years ago

@wkcn thanks for your reply,

mxnet v0.8:
- initialization time (i.e. the `MXPredCreate` call): 69906 ms (first run) / 1416 ms / 1419 ms / 1471 ms
- model inference time: 218 ms / 209.3 ms / 207.6 ms / 207.4 ms

mxnet v1.0.0:
- initialization time (i.e. the `MXPredCreate` call): 170572 ms (first run) / 2779 ms / 2794 ms / 2756 ms
- model inference time: 218.5 ms / 218.9 ms / 215.5 ms / 224.8 ms

As you can see, the initialization time of the higher MXNet version is nearly double that of v0.8, and my model's inference time is also slower.

sandeep-krishnamurthy commented 5 years ago

@leleamol - Can you please take a look at this issue and suggest why initialization time may be higher with the latest MXNet?

@IvyGongoogle - Is it possible to share a minimum reproducible example?

sandeep-krishnamurthy commented 5 years ago

Also, it may be related to https://github.com/apache/incubator-mxnet/issues/14569, where we changed a few things, like moving to Int64, to support large tensors.