notAI-tech / NudeNet

Lightweight nudity detection
https://nudenet.notai.tech/
GNU Affero General Public License v3.0

ONNX model poor performance #94

Open dimjava opened 3 years ago

dimjava commented 3 years ago

Since NudeNet uses Xception, it should run roughly twice as slow as ResNet50 (see page 5 of the article). You can check this with the pretrainedmodels module.

This holds for NudeNet's Keras model, but the ONNX model is 5+ times slower. (I ran a batch of 32 images through the network 20 times.)
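For anyone who wants to reproduce the measurement, here is a minimal timing sketch. It is not the exact script used for the numbers in this issue: `benchmark` and `make_onnx_runner` are hypothetical helper names, the model path and input batch are placeholders the caller must supply, and the `providers` argument assumes an onnxruntime version that accepts it in the `InferenceSession` constructor.

```python
import time
from statistics import mean


def benchmark(run_batch, n_iters=20, warmup=1):
    """Return the mean wall-clock time per call of run_batch, in milliseconds."""
    for _ in range(warmup):
        run_batch()  # warm-up calls are executed but not timed
    timings_ms = []
    for _ in range(n_iters):
        start = time.perf_counter()
        run_batch()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return mean(timings_ms)


def make_onnx_runner(model_path, batch):
    """Build a zero-argument callable that runs one batch through ONNX Runtime.

    model_path and batch are placeholders: batch must already match the
    model's input shape and dtype.
    """
    import onnxruntime as ort  # imported lazily so benchmark() works without it

    session = ort.InferenceSession(
        model_path,
        providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    )
    input_name = session.get_inputs()[0].name
    return lambda: session.run(None, {input_name: batch})
```

Calling `benchmark(make_onnx_runner(path, batch))` with a batch of 32 images and the default `n_iters=20` mirrors the setup described above. Keeping the warm-up call out of the average matters because the first run includes one-time initialization.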

However, if I convert this Keras model to ONNX myself using keras2onnx, the performance is fine as well. So the problem may be in the conversion method.
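The conversion can be sketched roughly as follows. This is an assumed reconstruction rather than the exact commands used here: `convert_keras_to_onnx` is a hypothetical helper, both paths are placeholders, and it assumes the Keras model loads with `tensorflow.keras` in a version that keras2onnx supports.

```python
def convert_keras_to_onnx(keras_path, onnx_path):
    """Convert a saved Keras model to ONNX with keras2onnx and write it to disk."""
    # Imported lazily: these are heavy, optional dependencies.
    import keras2onnx
    import onnx
    from tensorflow import keras

    # keras_path is a placeholder for the saved Keras model file.
    model = keras.models.load_model(keras_path)

    # convert_keras returns an ONNX ModelProto; the second argument is the
    # name given to the resulting graph.
    onnx_model = keras2onnx.convert_keras(model, model.name)

    onnx.save(onnx_model, onnx_path)
    return onnx_path
```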

Do you have any idea why this happens? What method did you use to convert the Keras model to ONNX?

Environment: cuda==11.1, GTX 1060 6GB, onnxruntime-gpu==1.7.0. Also tested on a GTX 2080 Ti with onnxruntime-gpu==1.4.0.

dimjava commented 3 years ago

Attached 2 profiles ("orig-" is NudeNet's onnx model, "converted-" is the one I converted from keras model).

You can see that the first (warm-up) model_run call takes about the same time in both profiles. After that, the original model takes 760 ms per batch, while the converted one takes only 210 ms.

Most of the time in the original model is spent in some unclear Memcpytoken* operations.
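To see which operators dominate a profile like the attached ones, the events can be aggregated per operator type. The sketch below makes assumptions about the file layout: that it is a JSON array of trace events, that per-node entries have `cat` equal to `"Node"`, a `name` ending in `_kernel_time`, a `dur` in microseconds, and the operator type under `args["op_name"]`. `summarize_profile` and `load_profile` are hypothetical helper names; adjust the field names if your onnxruntime version writes something different.

```python
import json
from collections import defaultdict


def summarize_profile(events, top=10):
    """Sum kernel time per operator type from ONNX Runtime profile events.

    Returns a list of (op_name, total_ms) sorted by descending total time.
    """
    totals_us = defaultdict(int)
    for event in events:
        if event.get("cat") != "Node":
            continue  # skip session-level events such as model_run
        if not event.get("name", "").endswith("_kernel_time"):
            continue  # skip other per-node entries so time is not double counted
        op_name = event.get("args", {}).get("op_name", "unknown")
        totals_us[op_name] += event.get("dur", 0)
    ranked = sorted(totals_us.items(), key=lambda kv: kv[1], reverse=True)
    return [(op, us / 1000.0) for op, us in ranked[:top]]


def load_profile(path):
    """Load a profile file (assumed to be a JSON array of events)."""
    with open(path) as f:
        return json.load(f)
```

Running `summarize_profile(load_profile(path))` on both files and comparing the top entries should show whether memory-copy nodes account for the gap between the two models.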

orig-onnxruntime_profile__2021-04-27_10-36-05.txt
converted-onnxruntime_profile__2021-04-27_10-35-05.txt