OAID / Tengine

Tengine is a lite, high performance, modular inference engine for embedded devices
Apache License 2.0
4.62k stars 999 forks

tengine retinaface bug on a311d #1166

Closed lovehuanhuan closed 3 years ago

lovehuanhuan commented 3 years ago

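A quantized RetinaFace model converted with the AML toolkit runs in 300 ms. The same model quantized with Tengine and run on the board takes tens of times longer, as shown below. AML NPU driver version: 6.4.0.3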

rpdzkj@rpdzkj:~/tending/crossResult/install/bin$ ./tm_retinaface_timvx -m retinaface_quant.tmfile -i 1.jpg
tengine-lite library version: 1.5-dev
img_h, img_w : 1440, 3840
Repeat 1 times, thread 1, avg time 5934.56 ms, max_time 5934.56 ms, min_time 5934.56 ms

detected face num: 65
BOX 1.00:( 680.977 , 1187.33 ),( 56.4845 , 62.0254 )
BOX 1.00:( 2875.42 , 148.826 ),( 54.8828 , 63.3807 )
BOX 1.00:( 3328.71 , 296.175 ),( 67.8745 , 75.0931 )
BOX 1.00:( 3630.37 , 1011.31 ),( 71.1338 , 78.1379 )
BOX 1.00:( 3478.43 , 121.853 ),( 56.2817 , 62.6993 )
BOX 1.00:( 360.689 , 282.273 ),( 63.1528 , 72.1686 )
BOX 1.00:( 291.368 , 967.927 ),( 37.7346 , 42.4302 )
BOX 1.00:( 2355.37 , 356.097 ),( 69.1099 , 68.1196 )
BOX 1.00:( 90.1891 , 374.825 ),( 68.1196 , 64.0696 )
BOX 1.00:( 845.246 , 347.744 ),( 62.4739 , 80.1402 )
BOX 1.00:( 3297.88 , 195.207 ),( 62.249 , 70.368 )
BOX 1.00:( 533.272 , 219.085 ),( 50.1704 , 55.0807 )
BOX 1.00:( 3265.79 , 130.071 ),( 49.6333 , 57.5095 )
BOX 1.00:( 3605.05 , 261.522 ),( 62.9258 , 67.1439 )
BOX 1.00:( 1060.58 , 248.469 ),( 62.4739 , 70.6224 )
BOX 1.00:( 2884.42 , 235.032 ),( 54.686 , 63.1528 )
BOX 1.00:( 3131.75 , 902.27 ),( 64.5332 , 74.5526 )
BOX 1.00:( 974.201 , 1147.83 ),( 60.6996 , 68.1196 )
BOX 1.00:( 3313.62 , 977.213 ),( 77.2954 , 82.7913 )
BOX 1.00:( 3336.23 , 899.148 ),( 64.0698 , 68.6129 )
BOX 1.00:( 2991.62 , 793.487 ),( 61.1382 , 62.0254 )
BOX 1.00:( 158.56 , 1187.74 ),( 64.5331 , 71.6494 )
BOX 0.99:( 2269.17 , 165.781 ),( 40.0117 , 42.8723 )
BOX 0.99:( 1241.23 , 141.288 ),( 47.543 , 60.0474 )
BOX 0.99:( 739.431 , 140.761 ),( 48.2296 , 55.4781 )
BOX 0.99:( 65.0179 , 255.026 ),( 59.6167 , 64.0696 )
BOX 0.99:( 740.489 , 906.083 ),( 28.7772 , 31.7786 )
BOX 0.99:( 782.916 , 981.789 ),( 31.9971 , 35.8359 )
BOX 0.99:( 153.856 , 927.34 ),( 34.2699 , 35.2251 )
BOX 0.99:( 2625.88 , 408.946 ),( 67.8745 , 83.3923 )
BOX 0.98:( 292.351 , 218.552 ),( 54.49 , 55.6779 )
BOX 0.98:( 2958.51 , 1320.19 ),( 110.356 , 114.434 )
BOX 0.98:( 1175.31 , 459.881 ),( 96.5022 , 113.606 )
BOX 0.98:( 333.978 , 918.075 ),( 28.2902 , 33 )
BOX 0.98:( 1121.21 , 913.52 ),( 28.7771 , 31.9971 )
BOX 0.98:( 1547.76 , 1088.78 ),( 83.3923 , 87.4084 )
BOX 0.97:( 2015.48 , 150.467 ),( 31.7786 , 33.8024 )
BOX 0.97:( 3383.12 , 421.194 ),( 69.1099 , 72.9546 )
BOX 0.97:( 2426.37 , 198.56 ),( 46.1997 , 58.9764 )
BOX 0.97:( 249.312 , 1066.65 ),( 47.7137 , 53.1365 )
BOX 0.97:( 3756.23 , 334.506 ),( 78.7046 , 91.6203 )
BOX 0.97:( 595.854 , 976.851 ),( 36.0833 , 37.8647 )
BOX 0.97:( 1893.8 , 205.172 ),( 35.5903 , 38.6557 )
BOX 0.97:( 2664.24 , 153.372 ),( 49.9907 , 58.9764 )
BOX 0.97:( 2560.26 , 539.831 ),( 88.79 , 109.956 )
BOX 0.97:( 2758.89 , 1177.31 ),( 101.894 , 103.757 )
BOX 0.96:( 235.235 , 882.208 ),( 27.4349 , 30.395 )
BOX 0.96:( 1600.99 , 762.933 ),( 37.7346 , 39.0575 )
BOX 0.96:( 2760.61 , 803.687 ),( 82.1943 , 90.6305 )
BOX 0.95:( 3520.24 , 550.545 ),( 75.3647 , 87.7253 )
BOX 0.94:( 854.548 , 1035.32 ),( 35.2251 , 36.3323 )
BOX 0.93:( 1740.24 , 292.723 ),( 41.4167 , 42.7244 )
BOX 0.93:( 2469.39 , 36.1886 ),( 33.6865 , 33.2272 )
BOX 0.92:( 2101.23 , 115.131 ),( 27.5283 , 30.8139 )
BOX 0.91:( 1284.24 , 835.46 ),( 63.1528 , 62.6993 )
BOX 0.90:( 362.821 , 893.675 ),( 27.8116 , 29.5747 )
BOX 0.90:( 2525.92 , 154.491 ),( 36.7095 , 40.8488 )
BOX 0.89:( 2384.89 , 107.822 ),( 31.6699 , 35.4681 )
BOX 0.89:( 37.0381 , 1052.69 ),( 50.1704 , 54.686 )
BOX 0.88:( 1428.07 , 398.74 ),( 28.4839 , 34.3878 )
BOX 0.87:( 3545.88 , 1139.08 ),( 54.686 , 65.9441 )
BOX 0.86:( 2059.82 , 981.729 ),( 44.0742 , 37.9954 )
BOX 0.86:( 1648.03 , 734.109 ),( 34.7441 , 34.7441 )
BOX 0.85:( 1979.06 , 126.58 ),( 33.2273 , 33.8024 )
BOX 0.82:( 680.553 , 532.237 ),( 78.4208 , 81.8978 )

lovehuanhuan commented 3 years ago

I also tried Tengine 1.4 and the result is the same: about 6 seconds, tens of times slower than acuity-toolkit.

nttstar commented 3 years ago

It is probably running inference on the CPU and not using the NPU.

lovehuanhuan commented 3 years ago

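In the end it turned out to be the quantization tool. With uint8 there is no problem (120 ms); with uint8 per-channel it takes 6 seconds, and the detection rate is only slightly higher.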
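
For readers unfamiliar with the two modes, here is a minimal sketch of the difference between per-tensor and per-channel uint8 quantization, assuming standard asymmetric (scale plus zero point) quantization. It is not the code of the Tengine quantization tool or of acuity-toolkit, and the weights and function names are made up for illustration. It only shows why per-channel can be slightly more accurate; it says nothing about why the per-channel model ran slower on the board.

```python
def quant_params(values, qmin=0, qmax=255):
    """Asymmetric uint8 scale and zero point covering the range of `values` and 0."""
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale for all-zero input
    zero_point = round(qmin - lo / scale)
    return scale, max(qmin, min(qmax, zero_point))


def roundtrip_error(values, scale, zero_point, qmin=0, qmax=255):
    """Worst absolute error after quantizing to uint8 and dequantizing again."""
    worst = 0.0
    for v in values:
        q = max(qmin, min(qmax, round(v / scale) + zero_point))
        worst = max(worst, abs(v - (q - zero_point) * scale))
    return worst


# Hypothetical weights: two output channels with very different value ranges.
weights = [
    [-0.02, 0.01, 0.015, -0.005],  # small-range channel
    [-2.0, 1.5, 0.7, -1.2],        # large-range channel
]
small = weights[0]
flat = [v for channel in weights for v in channel]

# Per-tensor: one (scale, zero_point) shared by every channel.
err_per_tensor = roundtrip_error(small, *quant_params(flat))
# Per-channel: each channel gets its own (scale, zero_point).
err_per_channel = roundtrip_error(small, *quant_params(small))

print("small-channel error, per-tensor :", err_per_tensor)
print("small-channel error, per-channel:", err_per_channel)
```

With a shared scale, the small-range channel is squeezed into a handful of the 256 levels, so its error is much larger than when it gets its own scale. That is consistent with the slightly higher detection rate reported for the per-channel model.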