Open hseok-oh opened 4 years ago

Execution time

Model | tflite(float) | cpu(float) | tflite(quint8) | cpu(quint8) |
---|---|---|---|---|
arithmetic | 52.919 | 6.172 | 49.005 | 10.37 |
comparison | 45.864 | 27.61 | 188.36 | 286.572 |
tensor000 | 133.531 | 19.704 | 62.275 | 21.853 |
tensor001 | 327.683 | 219.268 | 53.995 | 175.288 |
unary | 159.103 | 167.817 | 318.581 | 306.664 |
inception_v3 | 2755.685 | 1491.922 | 1188.801 | 1062.103 |
inception_v4 | 5648.375 | 2854.986 | 2861.491 | 2202.942 |
mobilenet_v1_1.0_224 | 383.965 | 391.332 | 252.476 | 195.974 |

Comparison with tflite(float): execution time ratio, tflite(float) = 1 (higher is faster)

Model | tflite(float) | cpu(float) | tflite(quint8) | cpu(quint8) |
---|---|---|---|---|
arithmetic | 1 | 8.57 | 1.08 | 5.10 |
comparison | 1 | 1.66 | 0.24 | 0.16 |
tensor000 | 1 | 6.78 | 2.14 | 6.11 |
tensor001 | 1 | 1.49 | 6.07 | 1.87 |
unary | 1 | 0.95 | 0.50 | 0.52 |
inception_v3 | 1 | 1.85 | 2.32 | 2.59 |
inception_v4 | 1 | 1.98 | 1.97 | 2.56 |
mobilenet_v1_1.0_224 | 1 | 0.98 | 1.52 | 1.96 |
Geomean | 1 | 2.17 | 1.36 | 1.68 |
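
The ratios in this table appear to be the tflite(float) execution time divided by each configuration's time, with the last row being the per-column geometric mean. The following is a minimal Python sketch of that arithmetic, assuming the timings from the execution-time table; the names `TIMES`, `speedups`, and `geomean` are hypothetical and not part of any tool used in this issue.

```python
import math

# Execution times copied from the execution-time table in this issue.
# Column order: tflite(float), cpu(float), tflite(quint8), cpu(quint8).
TIMES = {
    "arithmetic": (52.919, 6.172, 49.005, 10.37),
    "comparison": (45.864, 27.61, 188.36, 286.572),
    "tensor000": (133.531, 19.704, 62.275, 21.853),
    "tensor001": (327.683, 219.268, 53.995, 175.288),
    "unary": (159.103, 167.817, 318.581, 306.664),
    "inception_v3": (2755.685, 1491.922, 1188.801, 1062.103),
    "inception_v4": (5648.375, 2854.986, 2861.491, 2202.942),
    "mobilenet_v1_1.0_224": (383.965, 391.332, 252.476, 195.974),
}


def speedups(times, baseline=0):
    """Divide the baseline column by every column, model by model."""
    return {
        model: tuple(row[baseline] / t for t in row)
        for model, row in times.items()
    }


def geomean(values):
    """Geometric mean via the mean of logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


ratios = speedups(TIMES)
column_geomeans = [geomean(r[i] for r in ratios.values()) for i in range(4)]

for model, row in ratios.items():
    print(model, " | ".join(f"{v:.2f}" for v in row))
print("Geomean", " | ".join(f"{g:.2f}" for g in column_geomeans))
```

Passing `baseline=2` to `speedups` uses the tflite(quint8) column as the reference instead, which is how the quantized-only comparison can be derived from the same timings.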

Comparison with tflite(quant): execution time ratio, tflite(quint8) = 1 (higher is faster)

Model | tflite(quint8) | cpu(quint8) |
---|---|---|
arithmetic | 1 | 4.73 |
comparison | 1 | 0.66 |
tensor000 | 1 | 2.85 |
tensor001 | 1 | 0.31 |
unary | 1 | 1.04 |
inception_v3 | 1 | 1.12 |
inception_v4 | 1 | 1.30 |
mobilenet_v1_1.0_224 | 1 | 1.29 |
Geomean | 1.00 | 1.23 |

Memory usage (KB)

Model | tflite(float) | cpu(float) | tflite(quint8) | cpu(quint8) |
---|---|---|---|---|
arithmetic | 25052 | 21280 | 9336 | 9932 |
comparison | 26632 | 22944 | 13040 | 13844 |
tensor000 | 22116 | 21452 | 8596 | 10092 |
tensor001 | 41380 | 38020 | 13416 | 14208 |
unary | 22148 | 21292 | 8700 | 10108 |
inception_v3 | 208616 | 116324 | 28356 | 39776 |
inception_v4 | 352216 | 187732 | 49168 | 59220 |
mobilenet_v1_1.0_224 | 45324 | 32148 | 7612 | 12716 |

Comparison with tflite(float): memory usage, tflite(float) = 100% (lower is better)

Model | tflite(float) | cpu(float) | tflite(quint8) | cpu(quint8) |
---|---|---|---|---|
arithmetic | 100% | 85% | 37% | 40% |
comparison | 100% | 86% | 49% | 52% |
tensor000 | 100% | 97% | 39% | 46% |
tensor001 | 100% | 92% | 32% | 34% |
unary | 100% | 96% | 39% | 46% |
inception_v3 | 100% | 56% | 14% | 19% |
inception_v4 | 100% | 53% | 14% | 17% |
mobilenet_v1_1.0_224 | 100% | 71% | 17% | 28% |
Geomean | 100% | 78% | 27% | 33% |

Comparison with tflite(quant): memory usage, tflite(quint8) = 100% (lower is better)

Model | tflite(quint8) | cpu(quint8) |
---|---|---|
arithmetic | 100% | 106% |
comparison | 100% | 106% |
tensor000 | 100% | 117% |
tensor001 | 100% | 106% |
unary | 100% | 116% |
inception_v3 | 100% | 140% |
inception_v4 | 100% | 120% |
mobilenet_v1_1.0_224 | 100% | 167% |
Geomean | 100% | 121% |
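
The relative memory figures can be checked the same way: cpu(quint8) usage divided by tflite(quint8) usage, summarized with a geometric mean. This is a small sketch assuming the KB values reported in the memory usage table; `MEMORY_KB` and `relative` are illustrative names only.

```python
from statistics import geometric_mean

# Memory usage in KB copied from this issue: (tflite(quint8), cpu(quint8)).
MEMORY_KB = {
    "arithmetic": (9336, 9932),
    "comparison": (13040, 13844),
    "tensor000": (8596, 10092),
    "tensor001": (13416, 14208),
    "unary": (8700, 10108),
    "inception_v3": (28356, 39776),
    "inception_v4": (49168, 59220),
    "mobilenet_v1_1.0_224": (7612, 12716),
}

# cpu(quint8) usage relative to tflite(quint8); 1.0 would mean identical usage.
relative = {model: cpu / tflite for model, (tflite, cpu) in MEMORY_KB.items()}
overall = geometric_mean(list(relative.values()))

for model, ratio in relative.items():
    print(f"{model}: {ratio:.0%}")
print(f"Geomean: {overall:.0%}")
```

A geometric mean is used rather than an arithmetic mean because the values are ratios, so a single large model does not dominate the summary.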
model file: http://npu.mooo.com/archive/nnpkg_test_model/nnpkg_quant.tar.gz
inception_v4 (FULLY_CONNECTED, SOFTMAX)
mobilenet (DEPTHWISE_CONV_2D)
Total: 36 operations
Unzipped model file:
nnpkg
├── float
│   ├── inception_v3
│   ├── inception_v4
│   ├── mobilenet
│   ├── Model_Arithmetic
│   ├── Model_Comparison
│   ├── Model_Tensor_000
│   ├── Model_Tensor_001
│   └── Model_Unary
└── quant
    ├── inception_v3_quant
    ├── inception_v4_quant
    ├── mobilenet_quant
    ├── Model_Arithmetic_U8
    ├── Model_Comparison_U8
    ├── Model_Tensor_U8_000
    ├── Model_Tensor_U8_001
    └── Model_Unary_U8
nnpkg/float: FLOAT I/O model
nnpkg/quant: UINT8 (quantized) I/O model
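
The per-model parser invocations that follow could be generated rather than typed by hand. The sketch below assumes each model directory holds its `.tflite` file directly inside it, as the commands in this issue suggest; `parser_commands` is a hypothetical helper, and only the `tools/tflitefile_tool/model_parser.py` path is taken from the issue.

```python
from pathlib import Path


def parser_commands(nnpkg_root, variant="quant",
                    tool="tools/tflitefile_tool/model_parser.py"):
    """Build one model_parser.py command line per .tflite file found.

    Assumes the layout <nnpkg_root>/<variant>/<model_dir>/<name>.tflite.
    Returns an empty list when the variant directory does not exist.
    """
    root = Path(nnpkg_root) / variant
    if not root.is_dir():
        return []
    return [
        f"python3 {tool} {tflite.as_posix()}"
        for tflite in sorted(root.glob("*/*.tflite"))
    ]


if __name__ == "__main__":
    # Prints nothing unless an unpacked "nnpkg" directory is present.
    for command in parser_commands("nnpkg"):
        print(command)
```

Globbing for `*.tflite` instead of deriving the file name from the directory name matters here, since at least one file (`inception_v4_299_quant.tflite`) does not share its directory's name.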
$ python3 tools/tflitefile_tool/model_parser.py nnpkg/quant/Model_Arithmetic_U8/Model_Arithmetic_U8.tflite
#0 b'main' (MAIN) input tensors: [0 1]
Tensor 0 : buffer 1 | Empty | UINT8 | Memory 750.0K | Shape [1, 320, 240, 10] (b'ifm1')
Tensor 1 : buffer 2 | Empty | UINT8 | Memory 750.0K | Shape [1, 320, 240, 10] (b'ifm2')
#0 b'main' (MAIN) output tensors: [2 3 4]
Tensor 2 : buffer 3 | Empty | UINT8 | Memory 750.0K | Shape [1, 320, 240, 10] (b'ofm_add')
Tensor 3 : buffer 4 | Empty | UINT8 | Memory 750.0K | Shape [1, 320, 240, 10] (b'ofm_sub')
Tensor 4 : buffer 5 | Empty | UINT8 | Memory 750.0K | Shape [1, 320, 240, 10] (b'ofm_mul')
(operations)
==== Model Stats (1 Subgraphs) ====
Number of all operator types: 3
ADD : 1
MUL : 1
SUB : 1
Number of all operators : 3
$ python3 tools/tflitefile_tool/model_parser.py nnpkg/quant/Model_Comparison_U8/Model_Comparison_U8.tflite
#0 b'main' (MAIN) input tensors: [0 1]
Tensor 0 : buffer 1 | Empty | UINT8 | Memory 750.0K | Shape [1, 320, 240, 10] (b'ifm1')
Tensor 1 : buffer 2 | Empty | UINT8 | Memory 750.0K | Shape [1, 320, 240, 10] (b'ifm2')
#0 b'main' (MAIN) output tensors: [2 3 4 5 6 7 8 9]
Tensor 2 : buffer 3 | Empty | BOOL | Memory 750.0K | Shape [1, 320, 240, 10] (b'ofm_eq')
Tensor 3 : buffer 4 | Empty | BOOL | Memory 750.0K | Shape [1, 320, 240, 10] (b'ofm_gt')
Tensor 4 : buffer 5 | Empty | BOOL | Memory 750.0K | Shape [1, 320, 240, 10] (b'ofm_ge')
Tensor 5 : buffer 6 | Empty | BOOL | Memory 750.0K | Shape [1, 320, 240, 10] (b'ofm_lt')
Tensor 6 : buffer 7 | Empty | BOOL | Memory 750.0K | Shape [1, 320, 240, 10] (b'ofm_le')
Tensor 7 : buffer 8 | Empty | BOOL | Memory 750.0K | Shape [1, 320, 240, 10] (b'ofm_ne')
Tensor 8 : buffer 9 | Empty | UINT8 | Memory 750.0K | Shape [1, 320, 240, 10] (b'ofm_max')
Tensor 9 : buffer 10 | Empty | UINT8 | Memory 750.0K | Shape [1, 320, 240, 10] (b'ofm_min')
(operations)
==== Model Stats (1 Subgraphs) ====
Number of all operator types: 8
EQUAL : 1
GREATER : 1
GREATER_EQUAL : 1
LESS : 1
LESS_EQUAL : 1
MAXIMUM : 1
MINIMUM : 1
NOT_EQUAL : 1
Number of all operators : 8
$ python3 tools/tflitefile_tool/model_parser.py nnpkg/quant/Model_Tensor_U8_000/Model_Tensor_U8_000.tflite -v 0
#0 b'main' (MAIN) input tensors: [0]
Tensor 0 : buffer 1 | Empty | UINT8 | Memory 750.0K | Shape [1, 320, 240, 10] (b'input')
#0 b'main' (MAIN) output tensors: [ 2 4 5 7 8 9 10 12]
Tensor 2 : buffer 2 | Empty | UINT8 | Memory 767.3K | Shape [1, 322, 244, 10] (b'output_pad')
Tensor 4 : buffer 3 | Empty | UINT8 | Memory 767.3K | Shape [1, 322, 244, 10] (b'output_pad2')
Tensor 5 : buffer 4 | Empty | INT32 | Memory 16.0B | Shape [4] (b'output_shape')
Tensor 7 : buffer 5 | Empty | UINT8 | Memory 187.5K | Shape [1, 320, 60, 10] (b'output_split1')
Tensor 8 : buffer 6 | Empty | UINT8 | Memory 187.5K | Shape [1, 320, 60, 10] (b'output_split2')
Tensor 9 : buffer 7 | Empty | UINT8 | Memory 187.5K | Shape [1, 320, 60, 10] (b'output_split3')
Tensor 10 : buffer 8 | Empty | UINT8 | Memory 187.5K | Shape [1, 320, 60, 10] (b'output_split4')
Tensor 12 : buffer 9 | Empty | UINT8 | Memory 750.0K | Shape [1, 240, 320, 10] (b'output_transpose')
(operations)
==== Model Stats (1 Subgraphs) ====
Number of all operator types: 5
PAD : 1
PADV2 : 1
SHAPE : 1
SPLIT : 1
TRANSPOSE : 1
Number of all operators : 5
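
The `Memory` column in these listings is consistent with element count times element size, shown in 1024-based units. The following sketch reproduces the figures under that assumption; `DTYPE_BYTES`, `tensor_bytes`, and `human_size` are hypothetical names, and the formatting only mimics what the parser output looks like here rather than the tool's actual implementation.

```python
from math import prod

# Bytes per element for the tensor types that appear in this issue.
# BOOL is assumed to be stored as one byte per element.
DTYPE_BYTES = {"UINT8": 1, "BOOL": 1, "INT32": 4, "FLOAT32": 4}


def tensor_bytes(shape, dtype):
    """Size of a dense tensor in bytes: number of elements x element size."""
    return prod(shape) * DTYPE_BYTES[dtype]


def human_size(num_bytes):
    """Format bytes with one decimal in B, K, or M (1024-based)."""
    size = float(num_bytes)
    for unit in ("B", "K"):
        if size < 1024:
            return f"{size:.1f}{unit}"
        size /= 1024
    return f"{size:.1f}M"


for shape, dtype in [
    ([1, 320, 240, 10], "UINT8"),   # input
    ([1, 322, 244, 10], "UINT8"),   # output_pad
    ([4], "INT32"),                 # output_shape
    ([1, 320, 60, 10], "UINT8"),    # output_split1
]:
    print(shape, dtype, human_size(tensor_bytes(shape, dtype)))
```

For example, the `[1, 320, 240, 10]` UINT8 input holds 768000 elements of one byte each, which is 750.0K.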
$ python3 tools/tflitefile_tool/model_parser.py nnpkg/quant/Model_Tensor_U8_001/Model_Tensor_U8_001.tflite
#0 b'main' (MAIN) input tensors: [0 4]
Tensor 0 : buffer 1 | Empty | UINT8 | Memory 750.0K | Shape [4, 160, 120, 10] (b'input')
Tensor 4 : buffer 2 | Empty | UINT8 | Memory 750.0K | Shape [1, 320, 240, 10] (b'input2')
#0 b'main' (MAIN) output tensors: [ 3 6 8 11 13 14]
Tensor 3 : buffer 3 | Empty | UINT8 | Memory 750.0K | Shape [1, 320, 240, 10] (b'output_batch_to_space_nd')
Tensor 6 : buffer 4 | Empty | UINT8 | Memory 25.0K | Shape [1, 320, 8, 10] (b'output_gather')
Tensor 8 : buffer 5 | Empty | UINT8 | Memory 2.9M | Shape [1, 640, 480, 10] (b'output_resize_bilinear')
Tensor 11 : buffer 6 | Empty | UINT8 | Memory 93.8K | Shape [1, 80, 120, 10] (b'output_slice')
Tensor 13 : buffer 7 | Empty | UINT8 | Memory 750.0K | Shape [4, 160, 120, 10] (b'output_space_to_batch_nd')
Tensor 14 : buffer 8 | Empty | UINT8 | Memory 750.0K | Shape [1, 160, 120, 40] (b'output_space_to_depth')
(operations)
==== Model Stats (1 Subgraphs) ====
Number of all operator types: 6
BATCH_TO_SPACE_ND : 1
GATHER : 1
RESIZE_BILINEAR : 1
SLICE : 1
SPACE_TO_BATCH_ND : 1
SPACE_TO_DEPTH : 1
Number of all operators : 6
$ python3 tools/tflitefile_tool/model_parser.py nnpkg/quant/Model_Unary_U8/Model_Unary_U8.tflite
#0 b'main' (MAIN) input tensors: [0]
Tensor 0 : buffer 1 | Empty | UINT8 | Memory 750.0K | Shape [1, 320, 240, 10] (b'input')
#0 b'main' (MAIN) output tensors: [1 2 3 4 5 6]
Tensor 1 : buffer 2 | Empty | UINT8 | Memory 750.0K | Shape [1, 320, 240, 10] (b'output_l2_norm')
Tensor 2 : buffer 3 | Empty | UINT8 | Memory 750.0K | Shape [1, 320, 240, 10] (b'output_log_softmax')
Tensor 3 : buffer 4 | Empty | UINT8 | Memory 750.0K | Shape [1, 320, 240, 10] (b'output_logistic')
Tensor 4 : buffer 5 | Empty | UINT8 | Memory 750.0K | Shape [1, 320, 240, 10] (b'output_tanh')
Tensor 5 : buffer 6 | Empty | UINT8 | Memory 10.0B | Shape [1, 10] (b'output_reduce_mean')
Tensor 6 : buffer 7 | Empty | UINT8 | Memory 10.0B | Shape [1, 10] (b'output_reduce_sum')
(operations)
==== Model Stats (1 Subgraphs) ====
Number of all operator types: 6
L2_NORMALIZATION : 1
LOGISTIC : 1
LOG_SOFTMAX : 1
MEAN : 1
SUM : 1
TANH : 1
Number of all operators : 6
Expected TOTAL memory: 3.7M
Expected FILLED memory: 8.0B
$ python3 tools/tflitefile_tool/model_parser.py nnpkg/quant/inception_v3_quant/inception_v3_quant.tflite
#0 None (MAIN) input tensors: [315]
Tensor 315 : buffer 257 | Empty | UINT8 | Memory 261.9K | Shape [1, 299, 299, 3] (b'input')
#0 None (MAIN) output tensors: [316]
Tensor 316 : buffer 247 | Empty | UINT8 | Memory 1001.0B | Shape [1, 1001] (b'output')
(operations)
Number of all operator types: 5
AVERAGE_POOL_2D : 10
CONCATENATION : 15
CONV_2D : 95
MAX_POOL_2D : 4
RESHAPE : 1
Number of all operators : 125
inception_v4 (FULLY_CONNECTED, SOFTMAX)
$ python3 tools/tflitefile_tool/model_parser.py nnpkg/quant/inception_v4_quant/inception_v4_299_quant.tflite
#0 None (MAIN) input tensors: [495]
Tensor 495 : buffer 374 | Empty | UINT8 | Memory 261.9K | Shape [1, 299, 299, 3] (b'input')
#0 None (MAIN) output tensors: [494]
Tensor 494 : buffer 256 | Empty | UINT8 | Memory 1001.0B | Shape [1, 1001] (b'InceptionV4/Logits/Predictions')
(operations)
==== Model Stats (1 Subgraphs) ====
Number of all operator types: 6
AVERAGE_POOL_2D : 15
CONCATENATION : 25
CONV_2D : 149
FULLY_CONNECTED : 1
MAX_POOL_2D : 4
SOFTMAX : 1
Number of all operators : 195
mobilenet (DEPTHWISE_CONV_2D)
$ python3 tools/tflitefile_tool/model_parser.py nnpkg/quant/mobilenet_quant/mobilenet_v1_1.0_224_quant.tflite
#0 None (MAIN) input tensors: [88]
Tensor 88 : buffer 47 | Empty | UINT8 | Memory 147.0K | Shape [1, 224, 224, 3] (b'input')
#0 None (MAIN) output tensors: [87]
Tensor 87 : buffer 65 | Empty | UINT8 | Memory 1001.0B | Shape [1, 1001] (b'MobilenetV1/Predictions/Reshape_1')
(operations)
Number of all operator types: 5
AVERAGE_POOL_2D : 1
CONV_2D : 15
DEPTHWISE_CONV_2D : 13
RESHAPE : 1
SOFTMAX : 1
Number of all operators : 31
Expected TOTAL memory: 9.0M
Expected FILLED memory: 4.1M

No | OP \ Model | arithmetic | comparison | tensor000 | tensor001 | unary | inception v3 | inception v4 | mobilenet |
---|---|---|---|---|---|---|---|---|---|
1 | ADD | O | | | | | | | |
2 | AVERAGE_POOL_2D | | | | | | O | O | O |
3 | BATCH_TO_SPACE_ND | | | | O | | | | |
4 | CONCATENATION | | | | | | O | O | |
5 | CONV_2D | | | | | | O | O | O |
6 | DEPTHWISE_CONV_2D | | | | | | | | O |
7 | EQUAL | | O | | | | | | |
8 | FULLY_CONNECTED | | | | | | | O | |
9 | GATHER | | | | O | | | | |
10 | GREATER | | O | | | | | | |
11 | GREATER_EQUAL | | O | | | | | | |
12 | L2_NORMALIZATION | | | | | O | | | |
13 | LESS | | O | | | | | | |
14 | LESS_EQUAL | | O | | | | | | |
15 | LOG_SOFTMAX | | | | | O | | | |
16 | LOGISTIC | | | | | O | | | |
17 | MAX_POOL_2D | | | | | | O | O | |
18 | MAXIMUM | | O | | | | | | |
19 | MEAN | | | | | O | | | |
20 | MINIMUM | | O | | | | | | |
21 | MUL | O | | | | | | | |
22 | NOT_EQUAL | | O | | | | | | |
23 | PAD | | | O | | | | | |
24 | PADV2 | | | O | | | | | |
25 | RESHAPE | | | | | | O | O | O |
26 | RESIZE_BILINEAR | | | | O | | | | |
27 | SHAPE | | | O | | | | | |
28 | SLICE | | | | O | | | | |
29 | SOFTMAX | | | | | | | O | O |
30 | SPACE_TO_BATCH_ND | | | | O | | | | |
31 | SPACE_TO_DEPTH | | | | O | | | | |
32 | SPLIT | | | O | | | | | |
33 | SUB | O | | | | | | | |
34 | SUM | | | | | O | | | |
35 | TANH | | | | | O | | | |
36 | TRANSPOSE | | | O | | | | | |
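
The total of 36 operations can be cross-checked by taking the union of the operator types reported per model. This sketch assumes the operator sets from the quantized `model_parser.py` listings in this issue (the float variants are not listed and may differ); `MODEL_OPS` and `models_using` are hypothetical names.

```python
# Operator types copied from the quantized model_parser.py listings above.
MODEL_OPS = {
    "arithmetic": {"ADD", "MUL", "SUB"},
    "comparison": {"EQUAL", "GREATER", "GREATER_EQUAL", "LESS", "LESS_EQUAL",
                   "MAXIMUM", "MINIMUM", "NOT_EQUAL"},
    "tensor000": {"PAD", "PADV2", "SHAPE", "SPLIT", "TRANSPOSE"},
    "tensor001": {"BATCH_TO_SPACE_ND", "GATHER", "RESIZE_BILINEAR", "SLICE",
                  "SPACE_TO_BATCH_ND", "SPACE_TO_DEPTH"},
    "unary": {"L2_NORMALIZATION", "LOGISTIC", "LOG_SOFTMAX", "MEAN", "SUM",
              "TANH"},
    "inception_v3": {"AVERAGE_POOL_2D", "CONCATENATION", "CONV_2D",
                     "MAX_POOL_2D", "RESHAPE"},
    "inception_v4": {"AVERAGE_POOL_2D", "CONCATENATION", "CONV_2D",
                     "FULLY_CONNECTED", "MAX_POOL_2D", "SOFTMAX"},
    "mobilenet": {"AVERAGE_POOL_2D", "CONV_2D", "DEPTHWISE_CONV_2D",
                  "RESHAPE", "SOFTMAX"},
}


def models_using(op):
    """Names of the models whose listing contains the given operator."""
    return [model for model, ops in MODEL_OPS.items() if op in ops]


# Union of all operator types across the eight models, in alphabetical order.
all_ops = sorted(set().union(*MODEL_OPS.values()))

print(f"Total: {len(all_ops)} operations")
for number, op in enumerate(all_ops, start=1):
    print(number, op, ", ".join(models_using(op)))
```

Adding the models in the order listed shows where the count comes from: the five small models contribute 28 operator types, inception v3 adds 5, inception v4 adds 2, and mobilenet adds 1.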

Test model
Test setting
nnpackage_run without HDF5 linking
tflite_run (THREAD= 4)
Performance result (ubuntu 18.04)
Execution time
Comparison with tflite(float)
Comparison with tflite(quant)
Memory usage
Usage (KB)
Comparison with tflite(float)
Comparison with tflite(quant)
Result