Closed gasgallo closed 4 years ago
@gasgallo We could not find the cause of the error in the log. Please refer to debug-with-crash and get the crash stack.
@lu229 --debug_mode doesn't output anything. The adb logcat output is the following:
2020-01-16 11:59:56.304 7326-7326/? A/libc: Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x7edce00000 in tid 7326 (mace_run_static), pid 7326 (mace_run_static)
2020-01-16 11:59:56.330 7335-7335/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
2020-01-16 11:59:56.330 7335-7335/? A/DEBUG: Build fingerprint: 'Xiaomi/cepheus/cepheus:10/QKQ1.190825.002/V11.0.3.0.QFAMIXM:user/release-keys'
2020-01-16 11:59:56.330 7335-7335/? A/DEBUG: Revision: '0'
2020-01-16 11:59:56.330 7335-7335/? A/DEBUG: ABI: 'arm64'
2020-01-16 11:59:56.330 7335-7335/? A/DEBUG: Timestamp: 2020-01-16 11:59:56+0700
2020-01-16 11:59:56.330 7335-7335/? A/DEBUG: pid: 7326, tid: 7326, name: mace_run_static >>> /data/local/tmp/mace_run/mace_run_static <<<
2020-01-16 11:59:56.330 7335-7335/? A/DEBUG: uid: 2000
2020-01-16 11:59:56.330 7335-7335/? A/DEBUG: signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x7edce00000
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: x0 0000007edcaf6000 x1 0000000000000000 x2 ffffffffffca0e38 x3 0000007edce00000
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: x4 0000007edcaa0eb8 x5 0000000000000004 x6 64636e6bff646c60 x7 7f7f7f7f7f7f7f7f
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: x8 0000000000000000 x9 7fa3e82c9ac2207c x10 7fa3e82c9ac2207c x11 00000000fffffffd
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: x12 0000000000000001 x13 0000000000000031 x14 0000000000000020 x15 aaaaaaaaaaaaaaab
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: x16 0000005567ffb0d8 x17 0000007edcf6f780 x18 0000000000000000 x19 0000005567bc5a44
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: x20 0000007fe02091b8 x21 0000007fe0209278 x22 0000000000000017 x23 0000000000000000
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: x24 0000000000000000 x25 0000000000000000 x26 0000000000000000 x27 0000000000000000
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: x28 0000000000000000 x29 0000007fe0202780
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: sp 0000007fe0202770 lr 0000005567c02828 pc 0000007edcf6f880
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: backtrace:
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: NOTE: Function names and BuildId information is missing for some frames due
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: NOTE: to unreadable libraries. For unwinds of apps, only shared libraries
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: NOTE: found under the lib/ directory are readable.
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: #00 pc 000000000006f880 /apex/com.android.runtime/lib64/bionic/libc.so (memset+256) (BuildId: 084c8a81b8c78e19cd9a1ff6208e77cf)
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: #01 pc 000000000006a824 /data/local/tmp/mace_run/mace_run_static
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: #02 pc 00000000000c7cb8 /data/local/tmp/mace_run/mace_run_static
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: #03 pc 00000000000cbd48 /data/local/tmp/mace_run/mace_run_static
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: #04 pc 00000000001ebd08 /data/local/tmp/mace_run/mace_run_static
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: #05 pc 00000000000677c0 /data/local/tmp/mace_run/mace_run_static
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: #06 pc 00000000000682a4 /data/local/tmp/mace_run/mace_run_static
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: #07 pc 000000000002a5fc /data/local/tmp/mace_run/mace_run_static
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: #08 pc 000000000002d264 /data/local/tmp/mace_run/mace_run_static
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: #09 pc 000000000002da60 /data/local/tmp/mace_run/mace_run_static
2020-01-16 11:59:56.331 7335-7335/? A/DEBUG: #10 pc 000000000006ebc4 /apex/com.android.runtime/lib64/bionic/libc.so (__libc_init+108) (BuildId: 084c8a81b8c78e19cd9a1ff6208e77cf)
2020-01-16 11:59:56.345 7335-7335/? E/crash_dump64: cannot open libmiuindbg.so: No such file or directory
2020-01-16 11:59:56.346 7335-7335/? E/crash_dump64: AM data write failed: Broken pipe
2020-01-16 11:59:56.347 1249-1249/? E//system/bin/tombstoned: Tombstone written to: /data/tombstones/tombstone_06
@gasgallo The stack has no detailed info. Please refer to debug-with-crash and set the correct symbol path for ndk-stack; ndk-stack will then print the detailed stack info.
@lu229 What's the correct path? I'm currently passing the folder where MACE copies all its files:
adb logcat | $ANDROID_NDK_HOME/ndk-stack -sym /data/local/tmp/mace_run
but there's no detailed info anyway:
********** Crash dump: **********
Build fingerprint: 'Xiaomi/cepheus/cepheus:10/QKQ1.190825.002/V11.0.3.0.QFAMIXM:user/release-keys'
pid: 8085, tid: 8085, name: mace_run_static >>> /data/local/tmp/mace_run/mace_run_static <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x7a6fe00000
Stack frame #00 pc 000000000006f880 /apex/com.android.runtime/lib64/bionic/libc.so (memset+256) (BuildId: 084c8a81b8c78e19cd9a1ff6208e77cf)
Stack frame #01 pc 000000000006c168 /data/local/tmp/mace_run/mace_run_static
Stack frame #02 pc 00000000001362e0 /data/local/tmp/mace_run/mace_run_static
Stack frame #03 pc 000000000003e854 /data/local/tmp/mace_run/mace_run_static
Stack frame #04 pc 000000000001c848 /data/local/tmp/mace_run/mace_run_static
Stack frame #05 pc 000000000001ee0c /data/local/tmp/mace_run/mace_run_static
Stack frame #06 pc 000000000001f970 /data/local/tmp/mace_run/mace_run_static
Stack frame #07 pc 000000000006ebc4 /apex/com.android.runtime/lib64/bionic/libc.so (__libc_init+108) (BuildId: 084c8a81b8c78e19cd9a1ff6208e77cf)
Crash dump is completed
If I build the model and run with target armeabi-v7a, the log is different:
Generate input file: builds/model/_tmp/sp/8e5e6edf84635516eaabb3f63c6e7dbe/MI9_msmnile/armeabi-v7a/model_input_data
Generate input file done.
* Run 'sp' with round=1, restart_round=1, tuning=False, out_of_range_check=False, omp_num_threads=(-1,), cpu_affinity_policy=(1,), gpu_perf_hint=(3,), gpu_priority_hint=(3,)
Push builds/model/_tmp/sp/8e5e6edf84635516eaabb3f63c6e7dbe/MI9_msmnile/armeabi-v7a/model_input_data to /data/local/tmp/mace_run
Push builds/model/model/sp.data to /data/local/tmp/mace_run
Push builds/model/model/sp.pb to /data/local/tmp/mace_run/sp.pb
Push builds/model/_tmp/armeabi-v7a/mace_run_static to /data/local/tmp/mace_run
Push /tmp/cmd_file-sp-1579158073.53 to /data/local/tmp/mace_run/cmd_file-sp-1579158073.53
I mace/tools/validation/mace_run.cc:451] model name: sp
I mace/tools/validation/mace_run.cc:452] mace version: v0.11.0-rc0-0-g2d650b6
I mace/tools/validation/mace_run.cc:453] input node: data
I mace/tools/validation/mace_run.cc:454] input shape: 1,3,112,112
I mace/tools/validation/mace_run.cc:455] output node: fc1bn
I mace/tools/validation/mace_run.cc:456] output shape: 1,512,1,1
I mace/tools/validation/mace_run.cc:457] input_file: /data/local/tmp/mace_run/model_input
I mace/tools/validation/mace_run.cc:458] output_file: /data/local/tmp/mace_run/model_out
I mace/tools/validation/mace_run.cc:459] model_data_file: /data/local/tmp/mace_run/sp.data
I mace/tools/validation/mace_run.cc:460] model_file: /data/local/tmp/mace_run/sp.pb
I mace/tools/validation/mace_run.cc:461] device: CPU
I mace/tools/validation/mace_run.cc:462] round: 1
I mace/tools/validation/mace_run.cc:463] restart_round: 1
I mace/tools/validation/mace_run.cc:464] gpu_perf_hint: 3
I mace/tools/validation/mace_run.cc:465] gpu_priority_hint: 3
I mace/tools/validation/mace_run.cc:466] omp_num_threads: -1
I mace/tools/validation/mace_run.cc:467] cpu_affinity_policy: 1
I mace/libmace/mace.cc:431] Creating MaceEngine, MACE version: v0.11.0-rc0-0-g2d650b6
I mace/libmace/mace.cc:470] Initializing MaceEngine
I mace/libmace/mace.cc:603] Destroying MaceEngine
I mace/tools/validation/mace_run.cc:508] restart round 0
I mace/libmace/mace.cc:876] Create MaceEngine from model graph proto and weights data
I mace/libmace/mace.cc:431] Creating MaceEngine, MACE version: v0.11.0-rc0-0-g2d650b6
I mace/libmace/mace.cc:470] Initializing MaceEngine
I mace/tools/validation/mace_run.cc:265] Create Mace Engine latency: 15.579 ms
I mace/tools/validation/mace_run.cc:272] Total init latency: 15.693 ms
I mace/tools/validation/mace_run.cc:313] Warm up run
F mace/ops/depthwise_conv2d.cc:218] Check failed: filter->dim(2) == input->dim(3) 7 != 1024
F mace/ops/depthwise_conv2d.cc:218] backtrace:
F mace/ops/depthwise_conv2d.cc:218] pc 0xba6a08
F mace/ops/depthwise_conv2d.cc:218] pc 0xba6690
F mace/ops/depthwise_conv2d.cc:218] pc 0xbab428
F mace/ops/depthwise_conv2d.cc:218] pc 0xbab384
F mace/ops/depthwise_conv2d.cc:218] pc 0xbab6d4
F mace/ops/depthwise_conv2d.cc:218] pc 0xbab780
F mace/ops/depthwise_conv2d.cc:218] pc 0x9ab580
F mace/ops/depthwise_conv2d.cc:218] pc 0xb00708
F mace/ops/depthwise_conv2d.cc:218] pc 0x9388a4
F mace/ops/depthwise_conv2d.cc:218] pc 0x9394bc
F mace/ops/depthwise_conv2d.cc:218] pc 0x8f7368
F mace/ops/depthwise_conv2d.cc:218] pc 0x8fa570
F mace/ops/depthwise_conv2d.cc:218] pc 0x8faf30
F mace/ops/depthwise_conv2d.cc:218] pc 0xf26143a8 __libc_init
Aborted
ERROR: [Mace Run] /mace/tools/device.py:358: Mace run failed.
Could it be that this error isn't caught when the target is arm64-v8a?
Moreover, when I print the shapes of the filter (1,1024,7,7) and the input (1,7,7,1024), there is indeed a mismatch. Is this a bug in depthwise conv? The filter shape is supposed to be (7,7,1024,1), am I right?
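The failing check can be reproduced on the shapes alone. This is a minimal sketch, assuming (as the log line suggests) that the depthwise conv compares filter dim 2 with input dim 3. `depthwise_channel_check` is a hypothetical helper for illustration, not a MACE function.

```python
def depthwise_channel_check(filter_dims, input_dims):
    """Mimic the check `filter->dim(2) == input->dim(3)` from the log above.

    Hypothetical helper: it only compares two shape entries.
    """
    return filter_dims[2] == input_dims[3]


input_dims = (1, 7, 7, 1024)            # input shape printed in the thread
caffe_filter_dims = (1, 1024, 7, 7)     # filter shape printed in the thread
expected_filter_dims = (7, 7, 1024, 1)  # filter shape the poster expected

# 7 != 1024, so the check fails for the filter as converted from Caffe.
print(depthwise_channel_check(caffe_filter_dims, input_dims))
# 1024 == 1024, so the check passes for the expected layout.
print(depthwise_channel_check(expected_filter_dims, input_dims))
```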
@gasgallo the command should be: adb logcat | $ANDROID_NDK_HOME/ndk-stack -sym builds/model/_tmp/arm64-v8a/
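The earlier attempt passed /data/local/tmp/mace_run, which is a path on the device, whereas ndk-stack reads symbols from a directory on the host. Below is a minimal sketch of a wrapper that checks this before starting the pipeline. The function names are hypothetical, and the default binary name is simply the one shown in the crash dump.

```python
import os
import subprocess


def ndk_stack_argv(ndk_home, sym_dir, binary_name="mace_run_static"):
    """Return the ndk-stack command line, or raise if sym_dir lacks the binary.

    Hypothetical helper: it only checks that sym_dir is a directory on the
    host that contains the binary named in the crash dump.
    """
    if not os.path.isfile(os.path.join(sym_dir, binary_name)):
        raise FileNotFoundError(
            "%s not found in host directory %s" % (binary_name, sym_dir))
    return [os.path.join(ndk_home, "ndk-stack"), "-sym", sym_dir]


def symbolize_logcat(ndk_home, sym_dir):
    """Pipe `adb logcat` into ndk-stack, as in the command above."""
    argv = ndk_stack_argv(ndk_home, sym_dir)
    logcat = subprocess.Popen(["adb", "logcat"], stdout=subprocess.PIPE)
    try:
        subprocess.run(argv, stdin=logcat.stdout, check=False)
    finally:
        logcat.terminate()
```

Calling `symbolize_logcat(os.environ["ANDROID_NDK_HOME"], "builds/model/_tmp/arm64-v8a/")` would correspond to the command above, and it fails early if the directory does not hold the binary.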
@gasgallo I don't think the error on arm64-v8a is the same as the one on armeabi-v7a. The filter's shape format should be OIHW; please refer to: CPU runtime memory layout
@gasgallo It seems like a data layout issue. We have not tried to quantize a Caffe model before. Maybe you can set input_data_formats and output_data_formats to NHWC.
@lu229 Thanks, the following is the output of the command you suggested:
********** Crash dump: **********
Build fingerprint: 'Xiaomi/cepheus/cepheus:10/QKQ1.190825.002/V11.0.3.0.QFAMIXM:user/release-keys'
pid: 26491, tid: 26491, name: mace_run_static >>> /data/local/tmp/mace_run/mace_run_static <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x71f3c00000
Stack frame #00 pc 000000000006f880 /apex/com.android.runtime/lib64/bionic/libc.so (memset+256) (BuildId: 084c8a81b8c78e19cd9a1ff6208e77cf)
Stack frame #01 pc 000000000006a824 /data/local/tmp/mace_run/mace_run_static: Routine mace::Buffer::Clear(long) at ??:?
Stack frame #02 pc 00000000000c7cb8 /data/local/tmp/mace_run/mace_run_static: Routine mace::Tensor::Clear() at ??:?
Stack frame #03 pc 00000000000cbd48 /data/local/tmp/mace_run/mace_run_static: Routine mace::ops::DepthwiseConv2dOp<(mace::DeviceType)0, unsigned char>::Run(mace::OpContext*) at ??:?
Stack frame #04 pc 00000000001ebd08 /data/local/tmp/mace_run/mace_run_static: Routine mace::SerialNet::Run(mace::RunMetadata*) at ??:?
Stack frame #05 pc 00000000000677c0 /data/local/tmp/mace_run/mace_run_static: Routine mace::MaceEngine::Impl::Run(std::map<std::string, mace::MaceTensor, std::less<std::string>, std::allocator<std::pair<std::string const, mace::MaceTensor> > > const&, std::map<std::string, mace::MaceTensor, std::less<std::string>, std::allocator<std::pair<std::string const, mace::MaceTensor> > >*, mace::RunMetadata*) at ??:?
Stack frame #06 pc 00000000000682a4 /data/local/tmp/mace_run/mace_run_static: Routine mace::MaceEngine::Run(std::map<std::string, mace::MaceTensor, std::less<std::string>, std::allocator<std::pair<std::string const, mace::MaceTensor> > > const&, std::map<std::string, mace::MaceTensor, std::less<std::string>, std::allocator<std::pair<std::string const, mace::MaceTensor> > >*) at ??:?
Stack frame #07 pc 000000000002a5fc /data/local/tmp/mace_run/mace_run_static: Routine mace::tools::validation::RunModel(std::string const&, std::vector<std::string, std::allocator<std::string> > const&, std::vector<std::vector<long, std::allocator<long> >, std::allocator<std::vector<long, std::allocator<long> > > > const&, std::vector<mace::DataFormat, std::allocator<mace::DataFormat> > const&, std::vector<std::string, std::allocator<std::string> > const&, std::vector<std::vector<long, std::allocator<long> >, std::allocator<std::vector<long, std::allocator<long> > > > const&, std::vector<mace::DataFormat, std::allocator<mace::DataFormat> > const&, float) at ??:?
Stack frame #08 pc 000000000002d264 /data/local/tmp/mace_run/mace_run_static: Routine mace::tools::validation::Main(int, char**) at ??:?
Stack frame #09 pc 000000000002da60 /data/local/tmp/mace_run/mace_run_static: Routine main at ??:?
Stack frame #10 pc 000000000006ebc4 /apex/com.android.runtime/lib64/bionic/libc.so (__libc_init+108) (BuildId: 084c8a81b8c78e19cd9a1ff6208e77cf)
Line numbers are still missing, but at least there's some more information.
@gasgallo As @lee-bin said, we haven't quantized a Caffe model before. Perhaps you can set input_data_formats and output_data_formats to NHWC and try again. If it fails again, please attach your model and yml file, and I will debug it to find the cause of the error.
@lu229 @lee-bin I've also tried, as you suggested, to play around with data layouts, but it doesn't seem to help. I get the same error with target armeabi-v7a and the same segmentation fault with target arm64-v8a.
You can get my model and quantization stats from here.
The yaml file in the first post will work fine.
Thank you!!
@gasgallo No permission to access the model file link.
@yejw5 try now
@gasgallo The downloaded overall_range.txt doesn't seem to contain an entry for op pre_fc1? It crashed in the convert stage:
Add quantize tensor range
File "tools/python/convert.py", line 279, in <module>
convert(conf, flags.output)
File "tools/python/convert.py", line 75, in convert
mace_model = convert_model(model_conf)
File "tools/python/convert.py", line 184, in convert_model
output_graph_def, quantize_activation_info = mace_transformer.run()
File "/data/deeplearning/framework/mace/tools/python/transform/transformer.py", line 139, in run
changed = transformer()
File "/data/deeplearning/framework/mace/tools/python/transform/transformer.py", line 1858, in add_quantize_tensor_range
% op)
File "/data/deeplearning/framework/mace/tools/python/utils/util.py", line 76, in mace_check
for line in traceback.format_stack():
ERROR: /data/deeplearning/framework/mace/tools/python/transform/transformer.py:1858: input: "conv_6dw7_7_conv2d"
input: "pre_fc1_filter"
input: "fc1bn_offset"
output: "pre_fc1"
name: "pre_fc1"
type: "Conv2D"
arg {
name: "T"
i: 1
}
arg {
name: "framework_type"
i: 1
}
arg {
name: "data_format"
i: 1000
}
arg {
name: "strides"
ints: 1
ints: 1
}
arg {
name: "padding_values"
ints: 0
ints: 0
}
output_shape {
dims: 1
dims: 1
dims: 1
dims: 512
}
does not have quantize activation info
@yejw5 Sorry, use the following yaml file.
# The name of library
library_name: model
target_abis: [arm64-v8a]
model_graph_format: file
model_data_format: file
models:
  sp: # model tag, which will be used in model loading and must be specific.
    platform: caffe
    # path to your tensorflow model's pb file. Support local path, http:// and https://
    model_file_path: /models/sp/model-nofc.prototxt
    weight_file_path: /models/sp/model-nofc.caffemodel
    # sha256_checksum of your model's pb file.
    # use this command to get the sha256_checksum --> sha256sum path/to/your/pb/file
    model_sha256_checksum: 54479f5ec821884f5bfcc03cb1f4558275541c6e80d9f33f65cc58562fffe91b
    weight_sha256_checksum: e9599be0e9d5a5f08b85f9b98d2a76b55463ecb6820efc3bcdbc3ea0050f62a0
    subgraphs:
      - input_tensors:
          - data
        input_shapes:
          - 1,3,112,112
        input_data_formats:
          - NCHW
        output_tensors:
          - fc1bn
        output_shapes:
          - 1,1,1,512
    obfuscate: 0
    quantize: 1
    quantize_range_file: /mace/overall_range
    runtime: cpu # cpu, gpu or cpu+gpu or dsp
    winograd: 0
The output node is different and pre_fc1 is fused with fc1bn.
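The two checksum fields in the yaml above can also be computed without the sha256sum command mentioned in its comments. Here is a minimal sketch using Python's standard hashlib module; `file_sha256` is a hypothetical helper name.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large weight files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Under this sketch, the digest of the file at model_file_path would go into model_sha256_checksum, and the digest of the file at weight_file_path into weight_sha256_checksum.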
@gasgallo It's caused by the filter format of depthwise conv in Caffe. You can use this patch to fix it (apply it on the newest master code):
diff --git a/tools/python/transform/transformer.py b/tools/python/transform/transformer.py
index 69411e4..b3df498 100644
--- a/tools/python/transform/transformer.py
+++ b/tools/python/transform/transformer.py
@@ -1116,6 +1116,17 @@ class Transformer(base_converter.ConverterInterface):
filter.float_data[:] = filter_data.flat
filter.dims[:] = filter_data.shape
transposed_filter.add(op.input[1])
+ elif ConverterUtil.get_arg(
+ op, MaceKeyword.mace_framework_type_str).i == \
+ FrameworkType.CAFFE.value and \
+ op.type == MaceOp.DepthwiseConv2d.name:
+ filter = self._consts[op.input[1]]
+ filter_data = np.array(filter.float_data).reshape(
+ filter.dims)
+ filter_data = filter_data.transpose(2, 3, 1, 0)
+ filter.float_data[:] = filter_data.flat
+ filter.dims[:] = filter_data.shape
+ transposed_filter.add(op.input[1])
# deconv's filter's output channel and input channel is reversed
for op in net.op:
if op.type == MaceOp.Deconv2D.name and \
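The effect of the added branch on the filter can be checked in isolation. The following is a dependency-free sketch of what `reshape(filter.dims).transpose(2, 3, 1, 0)` followed by `.flat` does to row-major data. `transpose_flat` is a hypothetical stand-in for the NumPy calls in the patch, not code from MACE.

```python
from itertools import product


def transpose_flat(data, dims, axes):
    """Permute the axes of a row-major flat array.

    Returns (new_data, new_dims), where new_dims[k] == dims[axes[k]],
    following the same convention as numpy's transpose.
    """
    new_dims = [dims[a] for a in axes]
    # Row-major strides of the source array.
    strides = [1] * len(dims)
    for i in range(len(dims) - 2, -1, -1):
        strides[i] = strides[i + 1] * dims[i + 1]
    new_data = []
    for idx in product(*(range(d) for d in new_dims)):
        # idx[k] is the coordinate along source axis axes[k].
        src = sum(idx[k] * strides[axes[k]] for k in range(len(axes)))
        new_data.append(data[src])
    return new_data, new_dims


# Filter shape from the thread: (1, 1024, 7, 7).
dims = [1, 1024, 7, 7]
data = list(range(1 * 1024 * 7 * 7))
new_data, new_dims = transpose_flat(data, dims, (2, 3, 1, 0))
print(new_dims)  # [7, 7, 1024, 1]
```

After the permutation, dim 2 of the filter is 1024, which matches dim 3 of the (1,7,7,1024) input reported earlier in the thread.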
@yejw5 It works, thank you!
The performance of the quantized model is a bit disappointing, though. Seeing the benchmark of the inception v3 model on Pocophone here:
I was hoping for a similar result for my model, but the performance is as follows:
@lee-bin @lu229 Any comment on above results?
@gasgallo Running your model on a Mix2s (SoC: sdm845) for 10 rounds gave an average of 65.522 ms. It is already smaller and faster than inception v3. The speed gap may be primarily caused by hardware. Alternatively, you can reduce the complexity of the model graph.
Besides, lee-bin and lu229 are both on vacation. Maybe you can consult them after the Spring Festival.
@yejw5 I'm not referring to the absolute inference time, but to the fact that there's no improvement from using the quantized model on rk3399 and qcs605. On that hardware, the quantized model runs at the same speed as the non-quantized model.
Sorry for misreading.
The reasons are unknown for now. We need to analyze it op by op...
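One way to start an op-by-op analysis is to total the Avg(ms) column per op type for the quantized and the float runs and compare the two. This is a minimal sketch, assuming rows in the format that the MACE benchmark statistics print; `avg_ms_by_op_type` is a hypothetical helper, and the sample rows are copied from the benchmark output in this thread.

```python
from collections import defaultdict


def avg_ms_by_op_type(log_lines):
    """Sum the Avg(ms) column per op type from benchmark statistics rows."""
    totals = defaultdict(float)
    for line in log_lines:
        # Drop the log prefix before the first '|' and the tail after the last.
        cells = [c.strip() for c in line.split("|")[1:-1]]
        if len(cells) < 4:
            continue  # separator or non-table line
        try:
            avg_ms = float(cells[3])  # columns: Op Type, Start, First, Avg(ms)
        except ValueError:
            continue  # header row
        totals[cells[0]] += avg_ms
    return dict(totals)


sample = [
    "I mace/benchmark/statistics.cc:347] | Op Type | Start | First | Avg(ms) | % | cdf% | GMACPS | Stride | Pad | Filter Shape | Output Shape | Dilation | name |",
    "I mace/benchmark/statistics.cc:347] | Conv2D | 0.000 | 1.158 | 1.127 | 0.926 | 0.926 | 19.232 | [1,1] | [2,2] | [64,3,3,3] | [1,64,112,112] | | relu0 |",
    "I mace/benchmark/statistics.cc:347] | Conv2D | 1.149 | 11.004 | 10.822 | 8.890 | 9.816 | 10.682 | [2,2] | [2,2] | [64,64,3,3] | [1,64,56,56] | | stage1_unit1_relu1 |",
    "I mace/benchmark/statistics.cc:347] | Eltwise | 24.773 | 0.069 | 0.070 | 0.058 | 20.322 | 0.000 | | | | [1,64,56,56] | | _plus0 |",
]
print(avg_ms_by_op_type(sample))
```

Feed it the rows of only one of the sorted tables, since the benchmark prints the same ops several times in different orders.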
@yejw5 here's the benchmark with op stats:
***************************************************
Benchmark model sp on msmnile
***************************************************
I mace/benchmark/benchmark_model.cc:202] Model name: [sp]
I mace/benchmark/benchmark_model.cc:203] Model_file:
I mace/benchmark/benchmark_model.cc:204] Device: [CPU]
I mace/benchmark/benchmark_model.cc:205] gpu_perf_hint: [3]
I mace/benchmark/benchmark_model.cc:206] gpu_priority_hint: [3]
I mace/benchmark/benchmark_model.cc:207] omp_num_threads: [-1]
I mace/benchmark/benchmark_model.cc:208] cpu_affinity_policy: [1]
I mace/benchmark/benchmark_model.cc:209] Input node: [data]
I mace/benchmark/benchmark_model.cc:210] Input shapes: [1,3,112,112]
I mace/benchmark/benchmark_model.cc:211] Output node: [fc1bn]
I mace/benchmark/benchmark_model.cc:212] output shapes: [1,1,1,512]
I mace/benchmark/benchmark_model.cc:213] Warmup runs: [1]
I mace/benchmark/benchmark_model.cc:214] Num runs: [100]
I mace/benchmark/benchmark_model.cc:215] Max run seconds: [10]
I mace/libmace/mace.cc:431] Creating MaceEngine, MACE version: v0.11.0-rc0-0-g2d650b6
I mace/libmace/mace.cc:470] Initializing MaceEngine
I mace/benchmark/benchmark_model.cc:155]
I mace/benchmark/benchmark_model.cc:155] ---------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] Warm Up
I mace/benchmark/benchmark_model.cc:155] ----------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) | std |
I mace/benchmark/benchmark_model.cc:155] ----------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | 1 | 412.012 | 412.012 | 412.012 | 412.012 | 412.012 | 0.000 |
I mace/benchmark/benchmark_model.cc:155] ----------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155]
I mace/benchmark/benchmark_model.cc:155] ------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] Run without statistics
I mace/benchmark/benchmark_model.cc:155] -------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) | std |
I mace/benchmark/benchmark_model.cc:155] -------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | 83 | 134.143 | 121.375 | 120.978 | 134.143 | 121.844 | 1396.135 |
I mace/benchmark/benchmark_model.cc:155] -------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155]
I mace/benchmark/benchmark_model.cc:155] -----------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] Run with statistics
I mace/benchmark/benchmark_model.cc:155] ------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) | std |
I mace/benchmark/benchmark_model.cc:155] ------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | 82 | 122.892 | 122.783 | 121.927 | 124.547 | 122.641 | 368.202 |
I mace/benchmark/benchmark_model.cc:155] ------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] ---------------------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Sort by Run Order
I mace/benchmark/statistics.cc:347] ---------------------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Op Type | Start | First | Avg(ms) | % | cdf% | GMACPS | Stride | Pad | Filter Shape | Output Shape | Dilation | name |
I mace/benchmark/statistics.cc:347] ---------------------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Conv2D | 0.000 | 1.158 | 1.127 | 0.926 | 0.926 | 19.232 | [1,1] | [2,2] | [64,3,3,3] | [1,64,112,112] | | relu0 |
I mace/benchmark/statistics.cc:347] | Conv2D | 1.149 | 11.004 | 10.822 | 8.890 | 9.816 | 10.682 | [2,2] | [2,2] | [64,64,3,3] | [1,64,56,56] | | stage1_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 12.001 | 3.026 | 2.936 | 2.412 | 12.228 | 39.380 | [1,1] | [2,2] | [64,64,3,3] | [1,64,56,56] | | stage1_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 14.964 | 9.724 | 9.784 | 8.037 | 20.265 | 1.313 | [2,2] | [0,0] | [64,64,1,1] | [1,64,56,56] | | stage1_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Eltwise | 24.773 | 0.069 | 0.070 | 0.058 | 20.322 | 0.000 | | | | [1,64,56,56] | | _plus0 |
I mace/benchmark/statistics.cc:347] | Conv2D | 24.846 | 3.005 | 3.071 | 2.523 | 22.845 | 37.642 | [1,1] | [2,2] | [64,64,3,3] | [1,64,56,56] | | stage1_unit2_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 27.943 | 2.982 | 3.038 | 2.496 | 25.341 | 38.049 | [1,1] | [2,2] | [64,64,3,3] | [1,64,56,56] | | stage1_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 31.005 | 0.070 | 0.071 | 0.059 | 25.400 | 0.000 | | | | [1,64,56,56] | | _plus1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 31.079 | 3.112 | 3.037 | 2.495 | 27.895 | 38.066 | [1,1] | [2,2] | [64,64,3,3] | [1,64,56,56] | | stage1_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 34.154 | 2.988 | 3.015 | 2.477 | 30.372 | 38.340 | [1,1] | [2,2] | [64,64,3,3] | [1,64,56,56] | | stage1_unit3_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 37.194 | 0.070 | 0.072 | 0.059 | 30.431 | 0.000 | | | | [1,64,56,56] | | _plus2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 37.268 | 4.447 | 4.481 | 3.681 | 34.112 | 12.901 | [2,2] | [2,2] | [128,64,3,3] | [1,128,28,28] | | stage2_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 41.767 | 2.309 | 2.254 | 1.852 | 35.963 | 51.291 | [1,1] | [2,2] | [128,128,3,3] | [1,128,28,28] | | stage2_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 44.043 | 4.901 | 4.939 | 4.057 | 40.021 | 1.300 | [2,2] | [0,0] | [128,64,1,1] | [1,128,28,28] | | stage2_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Eltwise | 48.997 | 0.022 | 0.023 | 0.019 | 40.040 | 0.000 | | | | [1,128,28,28] | | _plus3 |
I mace/benchmark/statistics.cc:347] | Conv2D | 49.022 | 2.241 | 2.257 | 1.854 | 41.893 | 51.229 | [1,1] | [2,2] | [128,128,3,3] | [1,128,28,28] | | stage2_unit2_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 51.302 | 2.319 | 2.222 | 1.825 | 43.718 | 52.036 | [1,1] | [2,2] | [128,128,3,3] | [1,128,28,28] | | stage2_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 53.547 | 0.038 | 0.038 | 0.031 | 43.750 | 0.000 | | | | [1,128,28,28] | | _plus4 |
I mace/benchmark/statistics.cc:347] | Conv2D | 53.587 | 2.214 | 2.223 | 1.826 | 45.576 | 52.015 | [1,1] | [2,2] | [128,128,3,3] | [1,128,28,28] | | stage2_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 55.833 | 2.208 | 2.209 | 1.815 | 47.390 | 52.334 | [1,1] | [2,2] | [128,128,3,3] | [1,128,28,28] | | stage2_unit3_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 58.064 | 0.037 | 0.038 | 0.032 | 47.422 | 0.000 | | | | [1,128,28,28] | | _plus5 |
I mace/benchmark/statistics.cc:347] | Conv2D | 58.104 | 2.223 | 2.215 | 1.820 | 49.241 | 52.193 | [1,1] | [2,2] | [128,128,3,3] | [1,128,28,28] | | stage2_unit4_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 60.341 | 2.296 | 2.205 | 1.811 | 51.053 | 52.432 | [1,1] | [2,2] | [128,128,3,3] | [1,128,28,28] | | stage2_unit4_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 62.568 | 0.040 | 0.037 | 0.031 | 51.083 | 0.000 | | | | [1,128,28,28] | | _plus6 |
I mace/benchmark/statistics.cc:347] | Conv2D | 62.607 | 5.095 | 5.112 | 4.200 | 55.283 | 11.306 | [2,2] | [2,2] | [256,128,3,3] | [1,256,14,14] | | stage3_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 67.737 | 1.953 | 1.968 | 1.617 | 56.900 | 58.734 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 69.726 | 5.903 | 5.872 | 4.824 | 61.724 | 1.094 | [2,2] | [0,0] | [256,128,1,1] | [1,256,14,14] | | stage3_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Eltwise | 75.615 | 0.016 | 0.017 | 0.014 | 61.738 | 0.000 | | | | [1,256,14,14] | | _plus7 |
I mace/benchmark/statistics.cc:347] | Conv2D | 75.633 | 1.949 | 1.958 | 1.608 | 63.346 | 59.045 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit2_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 77.612 | 1.934 | 1.946 | 1.599 | 64.945 | 59.396 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 79.580 | 0.020 | 0.022 | 0.018 | 64.963 | 0.000 | | | | [1,256,14,14] | | _plus8 |
I mace/benchmark/statistics.cc:347] | Conv2D | 79.603 | 1.936 | 1.951 | 1.603 | 66.566 | 59.244 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 81.575 | 1.982 | 1.942 | 1.595 | 68.161 | 59.536 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit3_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 83.541 | 0.024 | 0.022 | 0.018 | 68.179 | 0.000 | | | | [1,256,14,14] | | _plus9 |
I mace/benchmark/statistics.cc:347] | Conv2D | 83.565 | 1.946 | 1.956 | 1.607 | 69.786 | 59.100 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit4_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 85.542 | 1.938 | 1.947 | 1.600 | 71.385 | 59.364 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit4_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 87.510 | 0.020 | 0.021 | 0.018 | 71.403 | 0.000 | | | | [1,256,14,14] | | _plus10 |
I mace/benchmark/statistics.cc:347] | Conv2D | 87.534 | 1.930 | 1.949 | 1.601 | 73.004 | 59.323 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit5_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 89.504 | 1.919 | 1.945 | 1.598 | 74.602 | 59.427 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit5_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 91.469 | 0.022 | 0.021 | 0.018 | 74.619 | 0.000 | | | | [1,256,14,14] | | _plus11 |
I mace/benchmark/statistics.cc:347] | Conv2D | 91.492 | 2.017 | 1.960 | 1.610 | 76.230 | 58.979 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit6_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 93.472 | 1.938 | 1.954 | 1.605 | 77.835 | 59.161 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit6_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 95.448 | 0.022 | 0.021 | 0.018 | 77.852 | 0.000 | | | | [1,256,14,14] | | _plus12 |
I mace/benchmark/statistics.cc:347] | Conv2D | 95.471 | 5.204 | 5.259 | 4.320 | 82.172 | 10.992 | [2,2] | [2,2] | [512,256,3,3] | [1,512,7,7] | | stage4_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 100.750 | 3.136 | 3.120 | 2.563 | 84.736 | 37.049 | [1,1] | [2,2] | [512,512,3,3] | [1,512,7,7] | | stage4_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 103.893 | 5.230 | 5.260 | 4.321 | 89.057 | 1.221 | [2,2] | [0,0] | [512,256,1,1] | [1,512,7,7] | | stage4_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Eltwise | 109.169 | 0.010 | 0.011 | 0.009 | 89.066 | 0.000 | | | | [1,512,7,7] | | _plus13 |
I mace/benchmark/statistics.cc:347] | Conv2D | 109.182 | 3.136 | 3.107 | 2.553 | 91.618 | 37.203 | [1,1] | [2,2] | [512,512,3,3] | [1,512,7,7] | | stage4_unit2_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 112.312 | 3.068 | 3.075 | 2.526 | 94.144 | 37.597 | [1,1] | [2,2] | [512,512,3,3] | [1,512,7,7] | | stage4_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 115.410 | 0.015 | 0.015 | 0.012 | 94.156 | 0.000 | | | | [1,512,7,7] | | _plus14 |
I mace/benchmark/statistics.cc:347] | Conv2D | 115.426 | 3.065 | 3.090 | 2.538 | 96.695 | 37.414 | [1,1] | [2,2] | [512,512,3,3] | [1,512,7,7] | | stage4_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 118.539 | 3.041 | 3.066 | 2.519 | 99.213 | 37.702 | [1,1] | [2,2] | [512,512,3,3] | [1,512,7,7] | | stage4_unit3_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 121.627 | 0.013 | 0.014 | 0.012 | 99.225 | 0.000 | | | | [1,512,7,7] | | _plus15 |
I mace/benchmark/statistics.cc:347] | Conv2D | 121.643 | 0.775 | 0.752 | 0.618 | 99.843 | 34.141 | [1,1] | [0,0] | [1024,512,1,1] | [1,1024,7,7] | | convx1 |
I mace/benchmark/statistics.cc:347] | DepthwiseConv2d | 122.410 | 0.084 | 0.084 | 0.069 | 99.912 | 0.600 | [1,1] | [0,0] | [1,1024,7,7] | [1,1024,1,1] | | conv_6dw7_7_conv2d |
I mace/benchmark/statistics.cc:347] | FullyConnected | 122.503 | 0.107 | 0.107 | 0.088 | 100.000 | 4.895 | | | [512,1024,1,1] | [1,512,1,1] | | pre_fc1 |
I mace/benchmark/statistics.cc:347] ---------------------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347]
I mace/benchmark/statistics.cc:347] ----------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Sort by Computation Time
I mace/benchmark/statistics.cc:347] ----------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Op Type | Start | First | Avg(ms) | % | cdf% | GMACPS | Stride | Pad | Filter Shape | Output Shape | Dilation | name |
I mace/benchmark/statistics.cc:347] ----------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Conv2D | 1.149 | 11.004 | 10.822 | 8.890 | 8.890 | 10.682 | [2,2] | [2,2] | [64,64,3,3] | [1,64,56,56] | | stage1_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 14.964 | 9.724 | 9.784 | 8.037 | 16.927 | 1.313 | [2,2] | [0,0] | [64,64,1,1] | [1,64,56,56] | | stage1_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Conv2D | 69.726 | 5.903 | 5.872 | 4.824 | 21.751 | 1.094 | [2,2] | [0,0] | [256,128,1,1] | [1,256,14,14] | | stage3_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Conv2D | 103.893 | 5.230 | 5.260 | 4.321 | 26.073 | 1.221 | [2,2] | [0,0] | [512,256,1,1] | [1,512,7,7] | | stage4_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Conv2D | 95.471 | 5.204 | 5.259 | 4.320 | 30.393 | 10.992 | [2,2] | [2,2] | [512,256,3,3] | [1,512,7,7] | | stage4_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 62.607 | 5.095 | 5.112 | 4.200 | 34.592 | 11.306 | [2,2] | [2,2] | [256,128,3,3] | [1,256,14,14] | | stage3_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 44.043 | 4.901 | 4.939 | 4.057 | 38.649 | 1.300 | [2,2] | [0,0] | [128,64,1,1] | [1,128,28,28] | | stage2_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Conv2D | 37.268 | 4.447 | 4.481 | 3.681 | 42.330 | 12.901 | [2,2] | [2,2] | [128,64,3,3] | [1,128,28,28] | | stage2_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 100.750 | 3.136 | 3.120 | 2.563 | 44.893 | 37.049 | [1,1] | [2,2] | [512,512,3,3] | [1,512,7,7] | | stage4_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 109.182 | 3.136 | 3.107 | 2.553 | 47.446 | 37.203 | [1,1] | [2,2] | [512,512,3,3] | [1,512,7,7] | | stage4_unit2_relu1 |
I mace/benchmark/statistics.cc:347] ----------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347]
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Stat by Op Type
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Op Type | Count | Avg(ms) | % | cdf% | MACs | GMACPS | Called times |
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Conv2D | 38 | 121.007 | 99.426 | 99.426 | 3,605,446,656 | 29.795 | 38 |
I mace/benchmark/statistics.cc:347] | Eltwise | 16 | 0.508 | 0.417 | 99.844 | 0 | 0.000 | 16 |
I mace/benchmark/statistics.cc:347] | FullyConnected | 1 | 0.107 | 0.088 | 99.932 | 524,288 | 4.900 | 1 |
I mace/benchmark/statistics.cc:347] | DepthwiseConv2d | 1 | 0.083 | 0.068 | 100.000 | 50,176 | 0.605 | 1 |
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347]
I mace/benchmark/statistics.cc:347] -----------------------------------------------------------
I mace/benchmark/statistics.cc:347] Stat by MACs(Multiply-Accumulation)
I mace/benchmark/statistics.cc:347] -----------------------------------------------------------
I mace/benchmark/statistics.cc:347] | total | round | first(G/s) | avg(G/s) | std |
I mace/benchmark/statistics.cc:347] -----------------------------------------------------------
I mace/benchmark/statistics.cc:347] | 3,606,021,120 | 82 | 29.569 | 29.623 | 365.000 |
I mace/benchmark/statistics.cc:347] -----------------------------------------------------------
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Summary of Ops' Stat
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) | std |
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | 82 | 121.951 | 121.880 | 121.017 | 123.636 | 121.731 | 365.000 |
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347]
I mace/benchmark/statistics.cc:347] 56 ops total.
I mace/libmace/mace.cc:603] Destroying MaceEngine
*************************************************************
Benchmark model sp on msmnile
*************************************************************
I mace/benchmark/benchmark_model.cc:202] Model name: [sp]
I mace/benchmark/benchmark_model.cc:203] Model_file:
I mace/benchmark/benchmark_model.cc:204] Device: [CPU]
I mace/benchmark/benchmark_model.cc:205] gpu_perf_hint: [3]
I mace/benchmark/benchmark_model.cc:206] gpu_priority_hint: [3]
I mace/benchmark/benchmark_model.cc:207] omp_num_threads: [-1]
I mace/benchmark/benchmark_model.cc:208] cpu_affinity_policy: [1]
I mace/benchmark/benchmark_model.cc:209] Input node: [data]
I mace/benchmark/benchmark_model.cc:210] Input shapes: [1,3,112,112]
I mace/benchmark/benchmark_model.cc:211] Output node: [fc1bn]
I mace/benchmark/benchmark_model.cc:212] output shapes: [1,1,1,512]
I mace/benchmark/benchmark_model.cc:213] Warmup runs: [1]
I mace/benchmark/benchmark_model.cc:214] Num runs: [100]
I mace/benchmark/benchmark_model.cc:215] Max run seconds: [10]
I mace/libmace/mace.cc:431] Creating MaceEngine, MACE version: v0.11.0-rc0-0-g2d650b6
I mace/libmace/mace.cc:470] Initializing MaceEngine
I mace/benchmark/benchmark_model.cc:155]
I mace/benchmark/benchmark_model.cc:155] ---------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] Warm Up
I mace/benchmark/benchmark_model.cc:155] ----------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) | std |
I mace/benchmark/benchmark_model.cc:155] ----------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | 1 | 278.372 | 278.372 | 278.372 | 278.372 | 278.372 | 0.000 |
I mace/benchmark/benchmark_model.cc:155] ----------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155]
I mace/benchmark/benchmark_model.cc:155] ------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] Run without statistics
I mace/benchmark/benchmark_model.cc:155] -------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) | std |
I mace/benchmark/benchmark_model.cc:155] -------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | 44 | 226.746 | 226.512 | 225.684 | 258.229 | 229.562 | 6732.804 |
I mace/benchmark/benchmark_model.cc:155] -------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155]
I mace/benchmark/benchmark_model.cc:155] ------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] Run with statistics
I mace/benchmark/benchmark_model.cc:155] -------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) | std |
I mace/benchmark/benchmark_model.cc:155] -------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | 44 | 229.383 | 227.496 | 226.548 | 259.413 | 230.770 | 6311.498 |
I mace/benchmark/benchmark_model.cc:155] -------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Sort by Run Order
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Op Type | Start | First | Avg(ms) | % | cdf% | GMACPS | Stride | Pad | Filter Shape | Output Shape | Dilation | name |
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Quantize | 0.000 | 0.076 | 0.076 | 0.033 | 0.033 | 0.000 | | | | [1,112,112,3] | | mace_input_node_data |
I mace/benchmark/statistics.cc:347] | Conv2D | 0.079 | 3.898 | 3.954 | 1.725 | 1.758 | 5.482 | [1,1] | [2,2] | [64,3,3,3] | [1,112,112,64] | | relu0 |
I mace/benchmark/statistics.cc:347] | Conv2D | 4.062 | 6.055 | 6.192 | 2.701 | 4.459 | 18.671 | [2,2] | [2,2] | [64,3,3,64] | [1,56,56,64] | | stage1_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 10.290 | 6.188 | 6.035 | 2.632 | 7.091 | 19.155 | [1,1] | [2,2] | [64,3,3,64] | [1,56,56,64] | | stage1_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 16.358 | 1.430 | 1.478 | 0.645 | 7.736 | 8.689 | [2,2] | [0,0] | [64,1,1,64] | [1,56,56,64] | | stage1_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Eltwise | 17.858 | 0.219 | 0.222 | 0.097 | 7.833 | 0.000 | | | | [1,56,56,64] | | _plus0 |
I mace/benchmark/statistics.cc:347] | Conv2D | 18.081 | 5.938 | 6.044 | 2.636 | 10.469 | 19.128 | [1,1] | [2,2] | [64,3,3,64] | [1,56,56,64] | | stage1_unit2_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 24.156 | 5.896 | 6.017 | 2.624 | 13.093 | 19.214 | [1,1] | [2,2] | [64,3,3,64] | [1,56,56,64] | | stage1_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 30.202 | 0.296 | 0.296 | 0.129 | 13.222 | 0.000 | | | | [1,56,56,64] | | _plus1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 30.504 | 5.596 | 6.054 | 2.640 | 15.863 | 19.097 | [1,1] | [2,2] | [64,3,3,64] | [1,56,56,64] | | stage1_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 36.587 | 5.934 | 6.018 | 2.625 | 18.488 | 19.209 | [1,1] | [2,2] | [64,3,3,64] | [1,56,56,64] | | stage1_unit3_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 42.634 | 0.294 | 0.342 | 0.149 | 18.637 | 0.000 | | | | [1,56,56,64] | | _plus2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 42.979 | 3.519 | 3.937 | 1.717 | 20.354 | 14.683 | [2,2] | [2,2] | [128,3,3,64] | [1,28,28,128] | | stage2_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 46.943 | 6.408 | 5.864 | 2.558 | 22.912 | 19.713 | [1,1] | [2,2] | [128,3,3,128] | [1,28,28,128] | | stage2_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 52.837 | 0.729 | 0.881 | 0.384 | 23.296 | 7.290 | [2,2] | [0,0] | [128,1,1,64] | [1,28,28,128] | | stage2_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Eltwise | 53.736 | 0.111 | 0.159 | 0.069 | 23.365 | 0.000 | | | | [1,28,28,128] | | _plus3 |
I mace/benchmark/statistics.cc:347] | Conv2D | 53.897 | 5.720 | 5.819 | 2.538 | 25.903 | 19.868 | [1,1] | [2,2] | [128,3,3,128] | [1,28,28,128] | | stage2_unit2_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 59.749 | 5.912 | 5.955 | 2.597 | 28.501 | 19.413 | [1,1] | [2,2] | [128,3,3,128] | [1,28,28,128] | | stage2_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 65.735 | 0.192 | 0.291 | 0.127 | 28.628 | 0.000 | | | | [1,28,28,128] | | _plus4 |
I mace/benchmark/statistics.cc:347] | Conv2D | 66.030 | 5.384 | 5.834 | 2.545 | 31.172 | 19.816 | [1,1] | [2,2] | [128,3,3,128] | [1,28,28,128] | | stage2_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 71.892 | 5.693 | 5.748 | 2.507 | 33.679 | 20.113 | [1,1] | [2,2] | [128,3,3,128] | [1,28,28,128] | | stage2_unit3_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 77.669 | 0.188 | 0.317 | 0.138 | 33.817 | 0.000 | | | | [1,28,28,128] | | _plus5 |
I mace/benchmark/statistics.cc:347] | Conv2D | 77.989 | 5.379 | 5.921 | 2.583 | 36.400 | 19.524 | [1,1] | [2,2] | [128,3,3,128] | [1,28,28,128] | | stage2_unit4_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 83.940 | 6.380 | 5.782 | 2.522 | 38.922 | 19.993 | [1,1] | [2,2] | [128,3,3,128] | [1,28,28,128] | | stage2_unit4_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 89.752 | 0.189 | 0.235 | 0.103 | 39.025 | 0.000 | | | | [1,28,28,128] | | _plus6 |
I mace/benchmark/statistics.cc:347] | Conv2D | 89.991 | 4.016 | 4.114 | 1.794 | 40.819 | 14.050 | [2,2] | [2,2] | [256,3,3,128] | [1,14,14,256] | | stage3_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 94.133 | 8.242 | 8.423 | 3.674 | 44.493 | 13.725 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 102.587 | 0.609 | 0.640 | 0.279 | 44.772 | 10.033 | [2,2] | [0,0] | [256,1,1,128] | [1,14,14,256] | | stage3_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Eltwise | 103.246 | 0.059 | 0.059 | 0.026 | 44.798 | 0.000 | | | | [1,14,14,256] | | _plus7 |
I mace/benchmark/statistics.cc:347] | Conv2D | 103.308 | 7.983 | 8.309 | 3.624 | 48.423 | 13.913 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit2_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 111.652 | 8.990 | 8.313 | 3.626 | 52.049 | 13.907 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 119.997 | 0.132 | 0.140 | 0.061 | 52.110 | 0.000 | | | | [1,14,14,256] | | _plus8 |
I mace/benchmark/statistics.cc:347] | Conv2D | 120.139 | 8.252 | 8.339 | 3.637 | 55.747 | 13.863 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 128.517 | 8.015 | 8.225 | 3.588 | 59.335 | 14.055 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit3_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 136.774 | 0.129 | 0.130 | 0.057 | 59.392 | 0.000 | | | | [1,14,14,256] | | _plus9 |
I mace/benchmark/statistics.cc:347] | Conv2D | 136.907 | 8.524 | 8.291 | 3.616 | 63.008 | 13.943 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit4_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 145.232 | 8.274 | 8.322 | 3.630 | 66.638 | 13.892 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit4_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 153.585 | 0.130 | 0.132 | 0.058 | 66.695 | 0.000 | | | | [1,14,14,256] | | _plus10 |
I mace/benchmark/statistics.cc:347] | Conv2D | 153.719 | 7.997 | 8.253 | 3.600 | 70.295 | 14.007 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit5_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 162.005 | 9.002 | 8.302 | 3.621 | 73.916 | 13.925 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit5_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 170.340 | 0.133 | 0.138 | 0.060 | 73.976 | 0.000 | | | | [1,14,14,256] | | _plus11 |
I mace/benchmark/statistics.cc:347] | Conv2D | 170.481 | 8.729 | 8.363 | 3.648 | 77.624 | 13.823 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit6_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 178.878 | 8.047 | 8.326 | 3.632 | 81.256 | 13.885 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit6_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 187.236 | 0.127 | 0.129 | 0.056 | 81.312 | 0.000 | | | | [1,14,14,256] | | _plus12 |
I mace/benchmark/statistics.cc:347] | Conv2D | 187.369 | 4.416 | 4.222 | 1.841 | 83.154 | 13.692 | [2,2] | [2,2] | [512,3,3,256] | [1,7,7,512] | | stage4_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 191.623 | 6.957 | 7.064 | 3.081 | 86.235 | 16.365 | [1,1] | [2,2] | [512,3,3,512] | [1,7,7,512] | | stage4_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 198.725 | 0.609 | 1.006 | 0.439 | 86.674 | 6.382 | [2,2] | [0,0] | [512,1,1,256] | [1,7,7,512] | | stage4_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Eltwise | 199.750 | 0.034 | 0.035 | 0.015 | 86.689 | 0.000 | | | | [1,7,7,512] | | _plus13 |
I mace/benchmark/statistics.cc:347] | Conv2D | 199.788 | 7.607 | 6.986 | 3.047 | 89.736 | 16.549 | [1,1] | [2,2] | [512,3,3,512] | [1,7,7,512] | | stage4_unit2_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 206.806 | 6.738 | 7.049 | 3.075 | 92.811 | 16.400 | [1,1] | [2,2] | [512,3,3,512] | [1,7,7,512] | | stage4_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 213.890 | 0.061 | 0.091 | 0.040 | 92.850 | 0.000 | | | | [1,7,7,512] | | _plus14 |
I mace/benchmark/statistics.cc:347] | Conv2D | 213.984 | 6.928 | 6.941 | 3.028 | 95.878 | 16.655 | [1,1] | [2,2] | [512,3,3,512] | [1,7,7,512] | | stage4_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 220.962 | 6.627 | 6.975 | 3.042 | 98.920 | 16.575 | [1,1] | [2,2] | [512,3,3,512] | [1,7,7,512] | | stage4_unit3_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 227.975 | 0.287 | 0.068 | 0.030 | 98.950 | 0.000 | | | | [1,7,7,512] | | _plus15 |
I mace/benchmark/statistics.cc:347] | Conv2D | 228.047 | 2.047 | 2.022 | 0.882 | 99.832 | 12.704 | [1,1] | [0,0] | [1024,1,1,512] | [1,7,7,1024] | | convx1 |
I mace/benchmark/statistics.cc:347] | DepthwiseConv2d | 230.086 | 0.099 | 0.038 | 0.017 | 99.849 | 192.125 | [1,1] | [0,0] | [7,7,1024,1] | [1,1,1,1024] | | conv_6dw7_7_conv2d |
I mace/benchmark/statistics.cc:347] | Conv2D | 230.137 | 0.348 | 0.342 | 0.149 | 99.998 | 1.535 | [1,1] | [0,0] | [512,1,1,1024] | [1,1,1,512] | | mace_output_node_pre_fc1 |
I mace/benchmark/statistics.cc:347] | Dequantize | 230.492 | 0.006 | 0.005 | 0.002 | 100.000 | 0.000 | | | | [1,1,1,512] | | fc1bn |
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347]
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Sort by Computation Time
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Op Type | Start | First | Avg(ms) | % | cdf% | GMACPS | Stride | Pad | Filter Shape | Output Shape | Dilation | name |
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Conv2D | 94.133 | 8.242 | 8.423 | 3.674 | 3.674 | 13.725 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 170.481 | 8.729 | 8.363 | 3.648 | 7.322 | 13.823 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit6_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 120.139 | 8.252 | 8.339 | 3.637 | 10.959 | 13.863 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 178.878 | 8.047 | 8.326 | 3.632 | 14.591 | 13.885 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit6_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 145.232 | 8.274 | 8.322 | 3.630 | 18.220 | 13.892 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit4_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 111.652 | 8.990 | 8.313 | 3.626 | 21.846 | 13.907 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 103.308 | 7.983 | 8.309 | 3.624 | 25.471 | 13.913 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit2_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 162.005 | 9.002 | 8.302 | 3.621 | 29.092 | 13.925 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit5_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 136.907 | 8.524 | 8.291 | 3.616 | 32.708 | 13.943 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit4_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 153.719 | 7.997 | 8.253 | 3.600 | 36.308 | 14.007 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit5_relu1 |
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347]
I mace/benchmark/statistics.cc:347] -------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Stat by Op Type
I mace/benchmark/statistics.cc:347] -------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Op Type | Count | Avg(ms) | % | cdf% | MACs | GMACPS | Called times |
I mace/benchmark/statistics.cc:347] -------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Conv2D | 39 | 226.340 | 98.737 | 98.737 | 3,605,970,944 | 15.932 | 39 |
I mace/benchmark/statistics.cc:347] | Eltwise | 16 | 2.776 | 1.211 | 99.948 | 0 | 0.000 | 16 |
I mace/benchmark/statistics.cc:347] | Quantize | 1 | 0.076 | 0.033 | 99.981 | 0 | 0.000 | 1 |
I mace/benchmark/statistics.cc:347] | DepthwiseConv2d | 1 | 0.038 | 0.017 | 99.998 | 7,340,032 | 193.159 | 1 |
I mace/benchmark/statistics.cc:347] | Dequantize | 1 | 0.005 | 0.002 | 100.000 | 0 | 0.000 | 1 |
I mace/benchmark/statistics.cc:347] -------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347]
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Stat by MACs(Multiply-Accumulation)
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | total | round | first(G/s) | avg(G/s) | std |
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | 3,613,310,976 | 44 | 15.863 | 15.760 | 6269.283 |
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------
I mace/benchmark/statistics.cc:347] -------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Summary of Ops' Stat
I mace/benchmark/statistics.cc:347] -------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) | std |
I mace/benchmark/statistics.cc:347] -------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | 44 | 227.778 | 226.001 | 225.092 | 257.680 | 229.264 | 6269.283 |
I mace/benchmark/statistics.cc:347] -------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347]
I mace/benchmark/statistics.cc:347] 58 ops total.
I mace/libmace/mace.cc:603] Destroying MaceEngine
*************************************************************
Benchmark model sp on qcs605
*************************************************************
I mace/benchmark/benchmark_model.cc:202] Model name: [sp]
I mace/benchmark/benchmark_model.cc:203] Model_file:
I mace/benchmark/benchmark_model.cc:204] Device: [CPU]
I mace/benchmark/benchmark_model.cc:205] gpu_perf_hint: [3]
I mace/benchmark/benchmark_model.cc:206] gpu_priority_hint: [3]
I mace/benchmark/benchmark_model.cc:207] omp_num_threads: [-1]
I mace/benchmark/benchmark_model.cc:208] cpu_affinity_policy: [1]
I mace/benchmark/benchmark_model.cc:209] Input node: [data]
I mace/benchmark/benchmark_model.cc:210] Input shapes: [1,3,112,112]
I mace/benchmark/benchmark_model.cc:211] Output node: [fc1bn]
I mace/benchmark/benchmark_model.cc:212] output shapes: [1,1,1,512]
I mace/benchmark/benchmark_model.cc:213] Warmup runs: [1]
I mace/benchmark/benchmark_model.cc:214] Num runs: [100]
I mace/benchmark/benchmark_model.cc:215] Max run seconds: [10]
I mace/libmace/mace.cc:431] Creating MaceEngine, MACE version: v0.11.0-rc0-0-g2d650b6
I mace/libmace/mace.cc:470] Initializing MaceEngine
I mace/benchmark/benchmark_model.cc:155]
I mace/benchmark/benchmark_model.cc:155] ------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] Warm Up
I mace/benchmark/benchmark_model.cc:155] -------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) | std |
I mace/benchmark/benchmark_model.cc:155] -------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | 1 | 1703.949 | 1703.949 | 1703.949 | 1703.949 | 1703.949 | 0.000 |
I mace/benchmark/benchmark_model.cc:155] -------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155]
I mace/benchmark/benchmark_model.cc:155] ------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] Run without statistics
I mace/benchmark/benchmark_model.cc:155] -------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) | std |
I mace/benchmark/benchmark_model.cc:155] -------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | 30 | 347.233 | 355.134 | 324.497 | 357.980 | 343.235 | 8428.071 |
I mace/benchmark/benchmark_model.cc:155] -------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155]
I mace/benchmark/benchmark_model.cc:155] -------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] Run with statistics
I mace/benchmark/benchmark_model.cc:155] --------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) | std |
I mace/benchmark/benchmark_model.cc:155] --------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | 30 | 339.917 | 341.660 | 317.651 | 367.916 | 343.331 | 11751.904 |
I mace/benchmark/benchmark_model.cc:155] --------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] ---------------------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Sort by Run Order
I mace/benchmark/statistics.cc:347] ---------------------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Op Type | Start | First | Avg(ms) | % | cdf% | GMACPS | Stride | Pad | Filter Shape | Output Shape | Dilation | name |
I mace/benchmark/statistics.cc:347] ---------------------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Conv2D | 0.000 | 8.727 | 4.612 | 1.352 | 1.352 | 4.699 | [1,1] | [2,2] | [64,3,3,3] | [1,64,112,112] | | relu0 |
I mace/benchmark/statistics.cc:347] | Conv2D | 4.668 | 28.125 | 28.545 | 8.370 | 9.722 | 4.050 | [2,2] | [2,2] | [64,64,3,3] | [1,64,56,56] | | stage1_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 33.288 | 6.931 | 7.398 | 2.169 | 11.891 | 15.628 | [1,1] | [2,2] | [64,64,3,3] | [1,64,56,56] | | stage1_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 40.740 | 23.381 | 24.941 | 7.313 | 19.205 | 0.515 | [2,2] | [0,0] | [64,64,1,1] | [1,64,56,56] | | stage1_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Eltwise | 65.743 | 0.424 | 0.318 | 0.093 | 19.298 | 0.000 | | | | [1,64,56,56] | | _plus0 |
I mace/benchmark/statistics.cc:347] | Conv2D | 66.075 | 7.793 | 7.790 | 2.284 | 21.582 | 14.840 | [1,1] | [2,2] | [64,64,3,3] | [1,64,56,56] | | stage1_unit2_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 73.916 | 7.172 | 6.874 | 2.016 | 23.598 | 16.818 | [1,1] | [2,2] | [64,64,3,3] | [1,64,56,56] | | stage1_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 80.840 | 0.224 | 0.183 | 0.054 | 23.651 | 0.000 | | | | [1,64,56,56] | | _plus1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 81.033 | 7.364 | 7.018 | 2.058 | 25.709 | 16.473 | [1,1] | [2,2] | [64,64,3,3] | [1,64,56,56] | | stage1_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 88.148 | 6.447 | 6.703 | 1.966 | 27.675 | 17.246 | [1,1] | [2,2] | [64,64,3,3] | [1,64,56,56] | | stage1_unit3_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 94.901 | 0.167 | 0.180 | 0.053 | 27.728 | 0.000 | | | | [1,64,56,56] | | _plus2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 95.090 | 10.986 | 13.771 | 4.038 | 31.765 | 4.198 | [2,2] | [2,2] | [128,64,3,3] | [1,128,28,28] | | stage2_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 108.919 | 7.270 | 6.409 | 1.879 | 33.644 | 18.039 | [1,1] | [2,2] | [128,128,3,3] | [1,128,28,28] | | stage2_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 115.380 | 8.906 | 8.910 | 2.612 | 36.257 | 0.721 | [2,2] | [0,0] | [128,64,1,1] | [1,128,28,28] | | stage2_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Eltwise | 124.341 | 0.064 | 0.079 | 0.023 | 36.280 | 0.000 | | | | [1,128,28,28] | | _plus3 |
I mace/benchmark/statistics.cc:347] | Conv2D | 124.427 | 6.015 | 6.337 | 1.858 | 38.138 | 18.244 | [1,1] | [2,2] | [128,128,3,3] | [1,128,28,28] | | stage2_unit2_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 130.814 | 5.542 | 6.047 | 1.773 | 39.911 | 19.117 | [1,1] | [2,2] | [128,128,3,3] | [1,128,28,28] | | stage2_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 136.921 | 0.102 | 0.211 | 0.062 | 39.973 | 0.000 | | | | [1,128,28,28] | | _plus4 |
I mace/benchmark/statistics.cc:347] | Conv2D | 137.140 | 7.359 | 6.218 | 1.823 | 41.797 | 18.592 | [1,1] | [2,2] | [128,128,3,3] | [1,128,28,28] | | stage2_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 143.413 | 5.962 | 5.726 | 1.679 | 43.475 | 20.190 | [1,1] | [2,2] | [128,128,3,3] | [1,128,28,28] | | stage2_unit3_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 149.207 | 0.104 | 0.081 | 0.024 | 43.499 | 0.000 | | | | [1,128,28,28] | | _plus5 |
I mace/benchmark/statistics.cc:347] | Conv2D | 149.297 | 5.545 | 5.756 | 1.688 | 45.187 | 20.084 | [1,1] | [2,2] | [128,128,3,3] | [1,128,28,28] | | stage2_unit4_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 155.106 | 5.666 | 5.727 | 1.679 | 46.866 | 20.185 | [1,1] | [2,2] | [128,128,3,3] | [1,128,28,28] | | stage2_unit4_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 160.885 | 0.075 | 0.075 | 0.022 | 46.888 | 0.000 | | | | [1,128,28,28] | | _plus6 |
I mace/benchmark/statistics.cc:347] | Conv2D | 160.966 | 12.732 | 14.461 | 4.240 | 51.129 | 3.997 | [2,2] | [2,2] | [256,128,3,3] | [1,256,14,14] | | stage3_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 175.483 | 6.434 | 6.069 | 1.780 | 52.908 | 19.049 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 181.599 | 10.298 | 9.710 | 2.847 | 55.755 | 0.661 | [2,2] | [0,0] | [256,128,1,1] | [1,256,14,14] | | stage3_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Eltwise | 191.361 | 0.032 | 0.044 | 0.013 | 55.768 | 0.000 | | | | [1,256,14,14] | | _plus7 |
I mace/benchmark/statistics.cc:347] | Conv2D | 191.420 | 5.259 | 6.238 | 1.829 | 57.597 | 18.533 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit2_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 197.705 | 6.891 | 6.286 | 1.843 | 59.440 | 18.392 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 204.041 | 0.036 | 0.040 | 0.012 | 59.452 | 0.000 | | | | [1,256,14,14] | | _plus8 |
I mace/benchmark/statistics.cc:347] | Conv2D | 204.086 | 5.310 | 5.837 | 1.712 | 61.163 | 19.805 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 209.976 | 5.416 | 5.466 | 1.603 | 62.766 | 21.148 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit3_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 215.502 | 0.033 | 0.041 | 0.012 | 62.778 | 0.000 | | | | [1,256,14,14] | | _plus9 |
I mace/benchmark/statistics.cc:347] | Conv2D | 215.546 | 5.582 | 5.689 | 1.668 | 64.446 | 20.322 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit4_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 221.286 | 5.303 | 5.489 | 1.609 | 66.056 | 21.063 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit4_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 226.825 | 0.034 | 0.039 | 0.012 | 66.067 | 0.000 | | | | [1,256,14,14] | | _plus10 |
I mace/benchmark/statistics.cc:347] | Conv2D | 226.868 | 5.742 | 5.648 | 1.656 | 67.723 | 20.467 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit5_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 232.565 | 5.271 | 5.529 | 1.621 | 69.345 | 20.909 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit5_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 238.148 | 0.078 | 0.041 | 0.012 | 69.357 | 0.000 | | | | [1,256,14,14] | | _plus11 |
I mace/benchmark/statistics.cc:347] | Conv2D | 238.193 | 5.908 | 5.651 | 1.657 | 71.014 | 20.457 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit6_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 243.893 | 5.393 | 5.758 | 1.689 | 72.702 | 20.076 | [1,1] | [2,2] | [256,256,3,3] | [1,256,14,14] | | stage3_unit6_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 249.705 | 0.035 | 0.039 | 0.011 | 72.714 | 0.000 | | | | [1,256,14,14] | | _plus12 |
I mace/benchmark/statistics.cc:347] | Conv2D | 249.749 | 16.792 | 15.521 | 4.551 | 77.265 | 3.724 | [2,2] | [2,2] | [512,256,3,3] | [1,512,7,7] | | stage4_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 265.341 | 12.329 | 13.139 | 3.853 | 81.117 | 8.798 | [1,1] | [2,2] | [512,512,3,3] | [1,512,7,7] | | stage4_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 278.545 | 13.807 | 11.460 | 3.360 | 84.478 | 0.560 | [2,2] | [0,0] | [512,256,1,1] | [1,512,7,7] | | stage4_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Eltwise | 290.051 | 0.037 | 0.095 | 0.028 | 84.505 | 0.000 | | | | [1,512,7,7] | | _plus13 |
I mace/benchmark/statistics.cc:347] | Conv2D | 290.149 | 12.908 | 14.068 | 4.125 | 88.630 | 8.218 | [1,1] | [2,2] | [512,512,3,3] | [1,512,7,7] | | stage4_unit2_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 304.266 | 10.369 | 11.983 | 3.514 | 92.144 | 9.648 | [1,1] | [2,2] | [512,512,3,3] | [1,512,7,7] | | stage4_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 316.298 | 0.027 | 0.025 | 0.007 | 92.151 | 0.000 | | | | [1,512,7,7] | | _plus14 |
I mace/benchmark/statistics.cc:347] | Conv2D | 316.326 | 10.983 | 11.967 | 3.509 | 95.660 | 9.660 | [1,1] | [2,2] | [512,512,3,3] | [1,512,7,7] | | stage4_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 328.342 | 10.469 | 11.826 | 3.468 | 99.128 | 9.776 | [1,1] | [2,2] | [512,512,3,3] | [1,512,7,7] | | stage4_unit3_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 340.217 | 0.024 | 0.030 | 0.009 | 99.137 | 0.000 | | | | [1,512,7,7] | | _plus15 |
I mace/benchmark/statistics.cc:347] | Conv2D | 340.250 | 2.579 | 2.572 | 0.754 | 99.891 | 9.987 | [1,1] | [0,0] | [1024,512,1,1] | [1,1024,7,7] | | convx1 |
I mace/benchmark/statistics.cc:347] | DepthwiseConv2d | 342.867 | 0.133 | 0.139 | 0.041 | 99.932 | 0.361 | [1,1] | [0,0] | [1,1024,7,7] | [1,1024,1,1] | | conv_6dw7_7_conv2d |
I mace/benchmark/statistics.cc:347] | FullyConnected | 343.038 | 0.340 | 0.233 | 0.068 | 100.000 | 2.255 | | | [512,1024,1,1] | [1,512,1,1] | | pre_fc1 |
I mace/benchmark/statistics.cc:347] ---------------------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347]
I mace/benchmark/statistics.cc:347] ----------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Sort by Computation Time
I mace/benchmark/statistics.cc:347] ----------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Op Type | Start | First | Avg(ms) | % | cdf% | GMACPS | Stride | Pad | Filter Shape | Output Shape | Dilation | name |
I mace/benchmark/statistics.cc:347] ----------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Conv2D | 4.668 | 28.125 | 28.545 | 8.370 | 8.370 | 4.050 | [2,2] | [2,2] | [64,64,3,3] | [1,64,56,56] | | stage1_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 40.740 | 23.381 | 24.941 | 7.313 | 15.683 | 0.515 | [2,2] | [0,0] | [64,64,1,1] | [1,64,56,56] | | stage1_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Conv2D | 249.749 | 16.792 | 15.521 | 4.551 | 20.234 | 3.724 | [2,2] | [2,2] | [512,256,3,3] | [1,512,7,7] | | stage4_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 160.966 | 12.732 | 14.461 | 4.240 | 24.474 | 3.997 | [2,2] | [2,2] | [256,128,3,3] | [1,256,14,14] | | stage3_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 290.149 | 12.908 | 14.068 | 4.125 | 28.599 | 8.218 | [1,1] | [2,2] | [512,512,3,3] | [1,512,7,7] | | stage4_unit2_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 95.090 | 10.986 | 13.771 | 4.038 | 32.637 | 4.198 | [2,2] | [2,2] | [128,64,3,3] | [1,128,28,28] | | stage2_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 265.341 | 12.329 | 13.139 | 3.853 | 36.490 | 8.798 | [1,1] | [2,2] | [512,512,3,3] | [1,512,7,7] | | stage4_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 304.266 | 10.369 | 11.983 | 3.514 | 40.003 | 9.648 | [1,1] | [2,2] | [512,512,3,3] | [1,512,7,7] | | stage4_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 316.326 | 10.983 | 11.967 | 3.509 | 43.512 | 9.660 | [1,1] | [2,2] | [512,512,3,3] | [1,512,7,7] | | stage4_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 328.342 | 10.469 | 11.826 | 3.468 | 46.980 | 9.776 | [1,1] | [2,2] | [512,512,3,3] | [1,512,7,7] | | stage4_unit3_relu2 |
I mace/benchmark/statistics.cc:347] ----------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347]
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Stat by Op Type
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Op Type | Count | Avg(ms) | % | cdf% | MACs | GMACPS | Called times |
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Conv2D | 38 | 339.130 | 99.448 | 99.448 | 3,605,446,656 | 10.631 | 38 |
I mace/benchmark/statistics.cc:347] | Eltwise | 16 | 1.514 | 0.444 | 99.892 | 0 | 0.000 | 16 |
I mace/benchmark/statistics.cc:347] | FullyConnected | 1 | 0.232 | 0.068 | 99.960 | 524,288 | 2.260 | 1 |
I mace/benchmark/statistics.cc:347] | DepthwiseConv2d | 1 | 0.138 | 0.040 | 100.000 | 50,176 | 0.364 | 1 |
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347]
I mace/benchmark/statistics.cc:347] -------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Stat by MACs(Multiply-Accumulation)
I mace/benchmark/statistics.cc:347] -------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | total | round | first(G/s) | avg(G/s) | std |
I mace/benchmark/statistics.cc:347] -------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | 3,606,021,120 | 30 | 10.702 | 10.574 | 11713.787 |
I mace/benchmark/statistics.cc:347] -------------------------------------------------------------
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Summary of Ops' Stat
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) | std |
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | 30 | 336.935 | 339.407 | 315.588 | 365.599 | 341.042 | 11713.787 |
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347]
I mace/benchmark/statistics.cc:347] 56 ops total.
I mace/libmace/mace.cc:603] Destroying MaceEngine
*************************************************************
Benchmark model sp on qcs605
*************************************************************
I mace/benchmark/benchmark_model.cc:202] Model name: [sp]
I mace/benchmark/benchmark_model.cc:203] Model_file:
I mace/benchmark/benchmark_model.cc:204] Device: [CPU]
I mace/benchmark/benchmark_model.cc:205] gpu_perf_hint: [3]
I mace/benchmark/benchmark_model.cc:206] gpu_priority_hint: [3]
I mace/benchmark/benchmark_model.cc:207] omp_num_threads: [-1]
I mace/benchmark/benchmark_model.cc:208] cpu_affinity_policy: [1]
I mace/benchmark/benchmark_model.cc:209] Input node: [data]
I mace/benchmark/benchmark_model.cc:210] Input shapes: [1,3,112,112]
I mace/benchmark/benchmark_model.cc:211] Output node: [fc1bn]
I mace/benchmark/benchmark_model.cc:212] output shapes: [1,1,1,512]
I mace/benchmark/benchmark_model.cc:213] Warmup runs: [1]
I mace/benchmark/benchmark_model.cc:214] Num runs: [100]
I mace/benchmark/benchmark_model.cc:215] Max run seconds: [10]
I mace/libmace/mace.cc:431] Creating MaceEngine, MACE version: v0.11.0-rc0-0-g2d650b6
I mace/libmace/mace.cc:470] Initializing MaceEngine
I mace/benchmark/benchmark_model.cc:155]
I mace/benchmark/benchmark_model.cc:155] ---------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] Warm Up
I mace/benchmark/benchmark_model.cc:155] ----------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) | std |
I mace/benchmark/benchmark_model.cc:155] ----------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | 1 | 353.460 | 353.460 | 353.460 | 353.460 | 353.460 | 0.000 |
I mace/benchmark/benchmark_model.cc:155] ----------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155]
I mace/benchmark/benchmark_model.cc:155] -------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] Run without statistics
I mace/benchmark/benchmark_model.cc:155] --------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) | std |
I mace/benchmark/benchmark_model.cc:155] --------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | 30 | 360.735 | 339.396 | 294.324 | 362.079 | 336.535 | 15204.868 |
I mace/benchmark/benchmark_model.cc:155] --------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155]
I mace/benchmark/benchmark_model.cc:155] -------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] Run with statistics
I mace/benchmark/benchmark_model.cc:155] --------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) | std |
I mace/benchmark/benchmark_model.cc:155] --------------------------------------------------------------------------
I mace/benchmark/benchmark_model.cc:155] | 30 | 364.769 | 330.680 | 309.523 | 364.769 | 338.014 | 13015.378 |
I mace/benchmark/benchmark_model.cc:155] --------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Sort by Run Order
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Op Type | Start | First | Avg(ms) | % | cdf% | GMACPS | Stride | Pad | Filter Shape | Output Shape | Dilation | name |
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Quantize | 0.000 | 0.121 | 0.165 | 0.049 | 0.049 | 0.000 | | | | [1,112,112,3] | | mace_input_node_data |
I mace/benchmark/statistics.cc:347] | Conv2D | 0.173 | 9.412 | 5.442 | 1.623 | 1.672 | 3.983 | [1,1] | [2,2] | [64,3,3,3] | [1,112,112,64] | | relu0 |
I mace/benchmark/statistics.cc:347] | Conv2D | 5.669 | 10.729 | 9.392 | 2.801 | 4.473 | 12.309 | [2,2] | [2,2] | [64,3,3,64] | [1,56,56,64] | | stage1_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 15.115 | 11.657 | 9.466 | 2.823 | 7.296 | 12.212 | [1,1] | [2,2] | [64,3,3,64] | [1,56,56,64] | | stage1_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 24.634 | 2.273 | 3.647 | 1.088 | 8.384 | 3.522 | [2,2] | [0,0] | [64,1,1,64] | [1,56,56,64] | | stage1_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Eltwise | 28.331 | 1.556 | 0.608 | 0.181 | 8.565 | 0.000 | | | | [1,56,56,64] | | _plus0 |
I mace/benchmark/statistics.cc:347] | Conv2D | 28.945 | 12.403 | 8.392 | 2.503 | 11.068 | 13.776 | [1,1] | [2,2] | [64,3,3,64] | [1,56,56,64] | | stage1_unit2_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 37.398 | 11.053 | 9.435 | 2.814 | 13.881 | 12.253 | [1,1] | [2,2] | [64,3,3,64] | [1,56,56,64] | | stage1_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 46.883 | 2.763 | 1.561 | 0.465 | 14.347 | 0.000 | | | | [1,56,56,64] | | _plus1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 48.461 | 10.481 | 8.409 | 2.508 | 16.855 | 13.747 | [1,1] | [2,2] | [64,3,3,64] | [1,56,56,64] | | stage1_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 56.923 | 10.095 | 9.541 | 2.845 | 19.700 | 12.117 | [1,1] | [2,2] | [64,3,3,64] | [1,56,56,64] | | stage1_unit3_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 66.512 | 1.198 | 1.655 | 0.493 | 20.193 | 0.000 | | | | [1,56,56,64] | | _plus2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 68.175 | 4.302 | 5.464 | 1.630 | 21.823 | 10.578 | [2,2] | [2,2] | [128,3,3,64] | [1,28,28,128] | | stage2_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 73.691 | 7.155 | 8.648 | 2.579 | 24.402 | 13.367 | [1,1] | [2,2] | [128,3,3,128] | [1,28,28,128] | | stage2_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 82.392 | 3.739 | 2.381 | 0.710 | 25.112 | 2.697 | [2,2] | [0,0] | [128,1,1,64] | [1,28,28,128] | | stage2_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Eltwise | 84.818 | 0.178 | 0.268 | 0.080 | 25.192 | 0.000 | | | | [1,28,28,128] | | _plus3 |
I mace/benchmark/statistics.cc:347] | Conv2D | 85.091 | 7.197 | 7.737 | 2.307 | 27.500 | 14.943 | [1,1] | [2,2] | [128,3,3,128] | [1,28,28,128] | | stage2_unit2_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 92.883 | 7.436 | 9.006 | 2.686 | 30.185 | 12.837 | [1,1] | [2,2] | [128,3,3,128] | [1,28,28,128] | | stage2_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 101.948 | 0.890 | 1.194 | 0.356 | 30.541 | 0.000 | | | | [1,28,28,128] | | _plus4 |
I mace/benchmark/statistics.cc:347] | Conv2D | 103.148 | 11.591 | 8.025 | 2.393 | 32.935 | 14.406 | [1,1] | [2,2] | [128,3,3,128] | [1,28,28,128] | | stage2_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 111.222 | 10.474 | 9.004 | 2.685 | 35.620 | 12.839 | [1,1] | [2,2] | [128,3,3,128] | [1,28,28,128] | | stage2_unit3_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 120.279 | 1.343 | 1.552 | 0.463 | 36.083 | 0.000 | | | | [1,28,28,128] | | _plus5 |
I mace/benchmark/statistics.cc:347] | Conv2D | 121.838 | 7.450 | 8.137 | 2.427 | 38.509 | 14.207 | [1,1] | [2,2] | [128,3,3,128] | [1,28,28,128] | | stage2_unit4_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 130.027 | 10.451 | 9.228 | 2.752 | 41.261 | 12.527 | [1,1] | [2,2] | [128,3,3,128] | [1,28,28,128] | | stage2_unit4_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 139.312 | 1.416 | 1.099 | 0.328 | 41.589 | 0.000 | | | | [1,28,28,128] | | _plus6 |
I mace/benchmark/statistics.cc:347] | Conv2D | 140.416 | 5.840 | 5.695 | 1.698 | 43.288 | 10.149 | [2,2] | [2,2] | [256,3,3,128] | [1,14,14,256] | | stage3_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 146.166 | 10.927 | 12.394 | 3.696 | 46.984 | 9.327 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 158.619 | 3.450 | 2.077 | 0.620 | 47.603 | 3.092 | [2,2] | [0,0] | [256,1,1,128] | [1,14,14,256] | | stage3_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Eltwise | 160.734 | 0.092 | 0.166 | 0.050 | 47.653 | 0.000 | | | | [1,14,14,256] | | _plus7 |
I mace/benchmark/statistics.cc:347] | Conv2D | 160.904 | 10.274 | 10.785 | 3.216 | 50.869 | 10.719 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit2_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 171.745 | 11.628 | 11.592 | 3.457 | 54.326 | 9.973 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 183.395 | 0.567 | 0.610 | 0.182 | 54.508 | 0.000 | | | | [1,14,14,256] | | _plus8 |
I mace/benchmark/statistics.cc:347] | Conv2D | 184.009 | 11.539 | 10.990 | 3.278 | 57.785 | 10.519 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 195.071 | 13.797 | 11.694 | 3.487 | 61.273 | 9.886 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit3_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 206.897 | 0.141 | 0.790 | 0.235 | 61.508 | 0.000 | | | | [1,14,14,256] | | _plus9 |
I mace/benchmark/statistics.cc:347] | Conv2D | 207.691 | 12.616 | 11.099 | 3.310 | 64.818 | 10.416 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit4_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 218.848 | 11.994 | 11.338 | 3.381 | 68.199 | 10.197 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit4_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 230.242 | 0.186 | 0.967 | 0.288 | 68.488 | 0.000 | | | | [1,14,14,256] | | _plus10 |
I mace/benchmark/statistics.cc:347] | Conv2D | 231.214 | 7.813 | 11.092 | 3.308 | 71.795 | 10.422 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit5_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 242.403 | 7.547 | 11.386 | 3.395 | 75.191 | 10.154 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit5_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 253.854 | 0.100 | 0.739 | 0.220 | 75.411 | 0.000 | | | | [1,14,14,256] | | _plus11 |
I mace/benchmark/statistics.cc:347] | Conv2D | 254.598 | 12.812 | 10.964 | 3.270 | 78.681 | 10.544 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit6_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 265.612 | 12.974 | 11.197 | 3.339 | 82.020 | 10.325 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit6_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 276.861 | 0.186 | 0.688 | 0.205 | 82.225 | 0.000 | | | | [1,14,14,256] | | _plus12 |
I mace/benchmark/statistics.cc:347] | Conv2D | 277.553 | 5.283 | 5.751 | 1.715 | 83.940 | 10.052 | [2,2] | [2,2] | [512,3,3,256] | [1,7,7,512] | | stage4_unit1_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 283.370 | 7.992 | 9.351 | 2.789 | 86.729 | 12.363 | [1,1] | [2,2] | [512,3,3,512] | [1,7,7,512] | | stage4_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 292.773 | 3.973 | 2.003 | 0.597 | 87.326 | 3.207 | [2,2] | [0,0] | [512,1,1,256] | [1,7,7,512] | | stage4_unit1_screlu |
I mace/benchmark/statistics.cc:347] | Eltwise | 294.827 | 0.054 | 0.153 | 0.046 | 87.372 | 0.000 | | | | [1,7,7,512] | | _plus13 |
I mace/benchmark/statistics.cc:347] | Conv2D | 294.984 | 9.678 | 8.054 | 2.402 | 89.773 | 14.354 | [1,1] | [2,2] | [512,3,3,512] | [1,7,7,512] | | stage4_unit2_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 303.089 | 10.430 | 9.298 | 2.773 | 92.546 | 12.434 | [1,1] | [2,2] | [512,3,3,512] | [1,7,7,512] | | stage4_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 312.440 | 3.590 | 1.110 | 0.331 | 92.877 | 0.000 | | | | [1,7,7,512] | | _plus14 |
I mace/benchmark/statistics.cc:347] | Conv2D | 313.556 | 8.455 | 8.992 | 2.682 | 95.559 | 12.856 | [1,1] | [2,2] | [512,3,3,512] | [1,7,7,512] | | stage4_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 322.600 | 12.810 | 9.267 | 2.763 | 98.322 | 12.476 | [1,1] | [2,2] | [512,3,3,512] | [1,7,7,512] | | stage4_unit3_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 331.921 | 0.827 | 0.867 | 0.259 | 98.581 | 0.000 | | | | [1,7,7,512] | | _plus15 |
I mace/benchmark/statistics.cc:347] | Conv2D | 332.793 | 4.178 | 3.363 | 1.003 | 99.584 | 7.640 | [1,1] | [0,0] | [1024,1,1,512] | [1,7,7,1024] | | convx1 |
I mace/benchmark/statistics.cc:347] | DepthwiseConv2d | 336.203 | 1.864 | 0.697 | 0.208 | 99.792 | 10.535 | [1,1] | [0,0] | [7,7,1024,1] | [1,1,1,1024] | | conv_6dw7_7_conv2d |
I mace/benchmark/statistics.cc:347] | Conv2D | 336.925 | 0.536 | 0.664 | 0.198 | 99.990 | 0.789 | [1,1] | [0,0] | [512,1,1,1024] | [1,1,1,512] | | mace_output_node_pre_fc1 |
I mace/benchmark/statistics.cc:347] | Dequantize | 337.626 | 0.074 | 0.035 | 0.010 | 100.000 | 0.000 | | | | [1,1,1,512] | | fc1bn |
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347]
I mace/benchmark/statistics.cc:347] ---------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Sort by Computation Time
I mace/benchmark/statistics.cc:347] ---------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Op Type | Start | First | Avg(ms) | % | cdf% | GMACPS | Stride | Pad | Filter Shape | Output Shape | Dilation | name |
I mace/benchmark/statistics.cc:347] ---------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Conv2D | 146.166 | 10.927 | 12.394 | 3.696 | 3.696 | 9.327 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 195.071 | 13.797 | 11.694 | 3.487 | 7.184 | 9.886 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit3_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 171.745 | 11.628 | 11.592 | 3.457 | 10.641 | 9.973 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit2_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 242.403 | 7.547 | 11.386 | 3.395 | 14.036 | 10.154 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit5_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 218.848 | 11.994 | 11.338 | 3.381 | 17.417 | 10.197 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit4_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 265.612 | 12.974 | 11.197 | 3.339 | 20.756 | 10.325 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit6_relu2 |
I mace/benchmark/statistics.cc:347] | Conv2D | 207.691 | 12.616 | 11.099 | 3.310 | 24.066 | 10.416 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit4_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 231.214 | 7.813 | 11.092 | 3.308 | 27.374 | 10.422 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit5_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 184.009 | 11.539 | 10.990 | 3.278 | 30.651 | 10.519 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit3_relu1 |
I mace/benchmark/statistics.cc:347] | Conv2D | 254.598 | 12.812 | 10.964 | 3.270 | 33.921 | 10.544 | [1,1] | [2,2] | [256,3,3,256] | [1,14,14,256] | | stage3_unit6_relu1 |
I mace/benchmark/statistics.cc:347] ---------------------------------------------------------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347]
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Stat by Op Type
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Op Type | Count | Avg(ms) | % | cdf% | MACs | GMACPS | Called times |
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | Conv2D | 39 | 320.382 | 95.553 | 95.553 | 3,605,970,944 | 11.255 | 39 |
I mace/benchmark/statistics.cc:347] | Eltwise | 16 | 14.017 | 4.181 | 99.733 | 0 | 0.000 | 16 |
I mace/benchmark/statistics.cc:347] | DepthwiseConv2d | 1 | 0.696 | 0.208 | 99.941 | 7,340,032 | 10.546 | 1 |
I mace/benchmark/statistics.cc:347] | Quantize | 1 | 0.165 | 0.049 | 99.990 | 0 | 0.000 | 1 |
I mace/benchmark/statistics.cc:347] | Dequantize | 1 | 0.034 | 0.010 | 100.000 | 0 | 0.000 | 1 |
I mace/benchmark/statistics.cc:347] ------------------------------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347]
I mace/benchmark/statistics.cc:347] -------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Stat by MACs(Multiply-Accumulation)
I mace/benchmark/statistics.cc:347] -------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | total | round | first(G/s) | avg(G/s) | std |
I mace/benchmark/statistics.cc:347] -------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | 3,613,310,976 | 30 | 9.993 | 10.776 | 12957.633 |
I mace/benchmark/statistics.cc:347] -------------------------------------------------------------
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] Summary of Ops' Stat
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) | std |
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347] | 30 | 361.590 | 328.352 | 307.066 | 361.590 | 335.323 | 12957.633 |
I mace/benchmark/statistics.cc:347] --------------------------------------------------------------------------
I mace/benchmark/statistics.cc:347]
I mace/benchmark/statistics.cc:347] 58 ops total.
I mace/libmace/mace.cc:603] Destroying MaceEngine
It looks like many ops are slower when quantized?
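One way to check this op by op is to parse the per-op rows from both benchmark logs and compare Avg(ms) by op name. Below is a minimal Python sketch, assuming the first log is the non-quantized run and that both logs come from the same device. `parse_op_stats` and `compare_ops` are hypothetical helper names, not part of MACE; the sample rows are copied from the logs above.

```python
# Sample rows copied verbatim from the two "Sort by Run Order" tables.
FLOAT_ROWS = """\
I mace/benchmark/statistics.cc:347] | Conv2D | 108.919 | 7.270 | 6.409 | 1.879 | 33.644 | 18.039 | [1,1] | [2,2] | [128,128,3,3] | [1,128,28,28] | | stage2_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 124.341 | 0.064 | 0.079 | 0.023 | 36.280 | 0.000 | | | | [1,128,28,28] | | _plus3 |
I mace/benchmark/statistics.cc:347] | Conv2D | 265.341 | 12.329 | 13.139 | 3.853 | 81.117 | 8.798 | [1,1] | [2,2] | [512,512,3,3] | [1,512,7,7] | | stage4_unit1_relu2 |
"""

QUANT_ROWS = """\
I mace/benchmark/statistics.cc:347] | Conv2D | 73.691 | 7.155 | 8.648 | 2.579 | 24.402 | 13.367 | [1,1] | [2,2] | [128,3,3,128] | [1,28,28,128] | | stage2_unit1_relu2 |
I mace/benchmark/statistics.cc:347] | Eltwise | 84.818 | 0.178 | 0.268 | 0.080 | 25.192 | 0.000 | | | | [1,28,28,128] | | _plus3 |
I mace/benchmark/statistics.cc:347] | Conv2D | 283.370 | 7.992 | 9.351 | 2.789 | 86.729 | 12.363 | [1,1] | [2,2] | [512,3,3,512] | [1,7,7,512] | | stage4_unit1_relu2 |
"""


def parse_op_stats(log_text):
    """Return {op name: Avg(ms)} from the 13-column per-op table rows."""
    stats = {}
    for line in log_text.splitlines():
        if "statistics.cc" not in line or "|" not in line:
            continue
        # Drop the log prefix before the first '|' and the empty tail after the last.
        cells = [c.strip() for c in line.split("|")[1:-1]]
        if len(cells) != 13:
            continue  # separator line or one of the summary tables
        try:
            avg_ms = float(cells[3])  # 4th column is Avg(ms)
        except ValueError:
            continue  # header row
        # The same op also appears in "Sort by Computation Time"; keep the first hit.
        stats.setdefault(cells[12], avg_ms)
    return stats


def compare_ops(float_log, quant_log):
    """Return (name, float_ms, quant_ms, quant/float ratio) for ops present in both logs."""
    float_stats = parse_op_stats(float_log)
    quant_stats = parse_op_stats(quant_log)
    return [
        (name, f_ms, quant_stats[name], quant_stats[name] / f_ms)
        for name, f_ms in float_stats.items()
        if name in quant_stats
    ]


if __name__ == "__main__":
    for name, f_ms, q_ms, ratio in compare_ops(FLOAT_ROWS, QUANT_ROWS):
        print(f"{name:22s} float={f_ms:.3f} ms  quant={q_ms:.3f} ms  ratio={ratio:.2f}")
```

A ratio above 1 means the op is slower in the quantized run. In the "Stat by Op Type" tables above, the Conv2D total goes from 339.130 ms to 320.382 ms, while the Eltwise total goes from 1.514 ms to 14.017 ms.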
I also see a big difference between the times reported by benchmark and run for target msmnile. Which one should I trust?
*************************************************************
Run model sp on msmnile
*************************************************************
I mace/tools/validation/mace_run.cc:451] model name: sp
I mace/tools/validation/mace_run.cc:452] mace version: v0.11.0-rc0-0-g2d650b6
I mace/tools/validation/mace_run.cc:453] input node: data
I mace/tools/validation/mace_run.cc:454] input shape: 1,3,112,112
I mace/tools/validation/mace_run.cc:455] output node: fc1bn
I mace/tools/validation/mace_run.cc:456] output shape: 1,1,1,512
I mace/tools/validation/mace_run.cc:457] input_file: /data/local/tmp/mace_run/model_input
I mace/tools/validation/mace_run.cc:458] output_file: /data/local/tmp/mace_run/model_out
I mace/tools/validation/mace_run.cc:459] model_data_file:
I mace/tools/validation/mace_run.cc:460] model_file:
I mace/tools/validation/mace_run.cc:461] device: CPU
I mace/tools/validation/mace_run.cc:462] round: 100
I mace/tools/validation/mace_run.cc:463] restart_round: 1
I mace/tools/validation/mace_run.cc:464] gpu_perf_hint: 3
I mace/tools/validation/mace_run.cc:465] gpu_priority_hint: 3
I mace/tools/validation/mace_run.cc:466] omp_num_threads: -1
I mace/tools/validation/mace_run.cc:467] cpu_affinity_policy: 1
I mace/libmace/mace.cc:431] Creating MaceEngine, MACE version: v0.11.0-rc0-0-g2d650b6
I mace/libmace/mace.cc:470] Initializing MaceEngine
I mace/libmace/mace.cc:603] Destroying MaceEngine
I mace/tools/validation/mace_run.cc:508] restart round 0
I mace/libmace/mace.cc:431] Creating MaceEngine, MACE version: v0.11.0-rc0-0-g2d650b6
I mace/libmace/mace.cc:470] Initializing MaceEngine
I mace/tools/validation/mace_run.cc:265] Create Mace Engine latency: 6.535 ms
I mace/tools/validation/mace_run.cc:272] Total init latency: 6.622 ms
I mace/tools/validation/mace_run.cc:313] Warm up run
I mace/tools/validation/mace_run.cc:349] 1st warm up run latency: 227.403 ms
I mace/tools/validation/mace_run.cc:356] Run model
I mace/tools/validation/mace_run.cc:407] Average latency: 92.6602 ms
========================================================
capability(CPU) init warmup run_avg
========================================================
time 18.459 6.622 227.403 92.660
I mace/tools/validation/mace_run.cc:430] Write output file /data/local/tmp/mace_run/model_out_fc1bn with size 2048 done.
I mace/libmace/mace.cc:603] Destroying MaceEngine
Running finished!
*************************************************************
Run model sp on msmnile
*************************************************************
I mace/tools/validation/mace_run.cc:451] model name: sp
I mace/tools/validation/mace_run.cc:452] mace version: v0.11.0-rc0-0-g2d650b6
I mace/tools/validation/mace_run.cc:453] input node: data
I mace/tools/validation/mace_run.cc:454] input shape: 1,3,112,112
I mace/tools/validation/mace_run.cc:455] output node: fc1bn
I mace/tools/validation/mace_run.cc:456] output shape: 1,1,1,512
I mace/tools/validation/mace_run.cc:457] input_file: /data/local/tmp/mace_run/model_input
I mace/tools/validation/mace_run.cc:458] output_file: /data/local/tmp/mace_run/model_out
I mace/tools/validation/mace_run.cc:459] model_data_file:
I mace/tools/validation/mace_run.cc:460] model_file:
I mace/tools/validation/mace_run.cc:461] device: CPU
I mace/tools/validation/mace_run.cc:462] round: 100
I mace/tools/validation/mace_run.cc:463] restart_round: 1
I mace/tools/validation/mace_run.cc:464] gpu_perf_hint: 3
I mace/tools/validation/mace_run.cc:465] gpu_priority_hint: 3
I mace/tools/validation/mace_run.cc:466] omp_num_threads: -1
I mace/tools/validation/mace_run.cc:467] cpu_affinity_policy: 1
I mace/libmace/mace.cc:431] Creating MaceEngine, MACE version: v0.11.0-rc0-0-g2d650b6
I mace/libmace/mace.cc:470] Initializing MaceEngine
I mace/libmace/mace.cc:603] Destroying MaceEngine
I mace/tools/validation/mace_run.cc:508] restart round 0
I mace/libmace/mace.cc:431] Creating MaceEngine, MACE version: v0.11.0-rc0-0-g2d650b6
I mace/libmace/mace.cc:470] Initializing MaceEngine
I mace/tools/validation/mace_run.cc:265] Create Mace Engine latency: 5.385 ms
I mace/tools/validation/mace_run.cc:272] Total init latency: 5.442 ms
I mace/tools/validation/mace_run.cc:313] Warm up run
I mace/tools/validation/mace_run.cc:349] 1st warm up run latency: 72.696 ms
I mace/tools/validation/mace_run.cc:356] Run model
I mace/tools/validation/mace_run.cc:407] Average latency: 60.1606 ms
========================================================
capability(CPU) init warmup run_avg
========================================================
time 18.838 5.442 72.696 60.161
I mace/tools/validation/mace_run.cc:430] Write output file /data/local/tmp/mace_run/model_out_fc1bn with size 2048 done.
I mace/libmace/mace.cc:603] Destroying MaceEngine
Running finished!
*************************************************************
Run model sp on qcs605
*************************************************************
I mace/tools/validation/mace_run.cc:451] model name: sp
I mace/tools/validation/mace_run.cc:452] mace version: v0.11.0-rc0-0-g2d650b6
I mace/tools/validation/mace_run.cc:453] input node: data
I mace/tools/validation/mace_run.cc:454] input shape: 1,3,112,112
I mace/tools/validation/mace_run.cc:455] output node: fc1bn
I mace/tools/validation/mace_run.cc:456] output shape: 1,1,1,512
I mace/tools/validation/mace_run.cc:457] input_file: /data/local/tmp/mace_run/model_input
I mace/tools/validation/mace_run.cc:458] output_file: /data/local/tmp/mace_run/model_out
I mace/tools/validation/mace_run.cc:459] model_data_file:
I mace/tools/validation/mace_run.cc:460] model_file:
I mace/tools/validation/mace_run.cc:461] device: CPU
I mace/tools/validation/mace_run.cc:462] round: 100
I mace/tools/validation/mace_run.cc:463] restart_round: 1
I mace/tools/validation/mace_run.cc:464] gpu_perf_hint: 3
I mace/tools/validation/mace_run.cc:465] gpu_priority_hint: 3
I mace/tools/validation/mace_run.cc:466] omp_num_threads: -1
I mace/tools/validation/mace_run.cc:467] cpu_affinity_policy: 1
I mace/libmace/mace.cc:431] Creating MaceEngine, MACE version: v0.11.0-rc0-0-g2d650b6
I mace/libmace/mace.cc:470] Initializing MaceEngine
I mace/libmace/mace.cc:603] Destroying MaceEngine
I mace/tools/validation/mace_run.cc:508] restart round 0
I mace/libmace/mace.cc:431] Creating MaceEngine, MACE version: v0.11.0-rc0-0-g2d650b6
I mace/libmace/mace.cc:470] Initializing MaceEngine
I mace/tools/validation/mace_run.cc:265] Create Mace Engine latency: 28.322 ms
I mace/tools/validation/mace_run.cc:272] Total init latency: 28.615 ms
I mace/tools/validation/mace_run.cc:313] Warm up run
I mace/tools/validation/mace_run.cc:349] 1st warm up run latency: 1564.62 ms
I mace/tools/validation/mace_run.cc:356] Run model
I mace/tools/validation/mace_run.cc:407] Average latency: 339.838 ms
========================================================
capability(CPU) init warmup run_avg
========================================================
time 34.125 28.615 1564.618 339.838
I mace/tools/validation/mace_run.cc:430] Write output file /data/local/tmp/mace_run/model_out_fc1bn with size 2048 done.
I mace/libmace/mace.cc:603] Destroying MaceEngine
Running finished!
*************************************************************
Run model sp on qcs605
*************************************************************
I mace/tools/validation/mace_run.cc:451] model name: sp
I mace/tools/validation/mace_run.cc:452] mace version: v0.11.0-rc0-0-g2d650b6
I mace/tools/validation/mace_run.cc:453] input node: data
I mace/tools/validation/mace_run.cc:454] input shape: 1,3,112,112
I mace/tools/validation/mace_run.cc:455] output node: fc1bn
I mace/tools/validation/mace_run.cc:456] output shape: 1,1,1,512
I mace/tools/validation/mace_run.cc:457] input_file: /data/local/tmp/mace_run/model_input
I mace/tools/validation/mace_run.cc:458] output_file: /data/local/tmp/mace_run/model_out
I mace/tools/validation/mace_run.cc:459] model_data_file:
I mace/tools/validation/mace_run.cc:460] model_file:
I mace/tools/validation/mace_run.cc:461] device: CPU
I mace/tools/validation/mace_run.cc:462] round: 100
I mace/tools/validation/mace_run.cc:463] restart_round: 1
I mace/tools/validation/mace_run.cc:464] gpu_perf_hint: 3
I mace/tools/validation/mace_run.cc:465] gpu_priority_hint: 3
I mace/tools/validation/mace_run.cc:466] omp_num_threads: -1
I mace/tools/validation/mace_run.cc:467] cpu_affinity_policy: 1
I mace/libmace/mace.cc:431] Creating MaceEngine, MACE version: v0.11.0-rc0-0-g2d650b6
I mace/libmace/mace.cc:470] Initializing MaceEngine
I mace/libmace/mace.cc:603] Destroying MaceEngine
I mace/tools/validation/mace_run.cc:508] restart round 0
I mace/libmace/mace.cc:431] Creating MaceEngine, MACE version: v0.11.0-rc0-0-g2d650b6
I mace/libmace/mace.cc:470] Initializing MaceEngine
I mace/tools/validation/mace_run.cc:265] Create Mace Engine latency: 31.947 ms
I mace/tools/validation/mace_run.cc:272] Total init latency: 32.216 ms
I mace/tools/validation/mace_run.cc:313] Warm up run
I mace/tools/validation/mace_run.cc:349] 1st warm up run latency: 464.248 ms
I mace/tools/validation/mace_run.cc:356] Run model
I mace/tools/validation/mace_run.cc:407] Average latency: 337.925 ms
========================================================
capability(CPU) init warmup run_avg
========================================================
time 35.085 32.216 464.248 337.925
I mace/tools/validation/mace_run.cc:430] Write output file /data/local/tmp/mace_run/model_out_fc1bn with size 2048 done.
I mace/libmace/mace.cc:603] Destroying MaceEngine
Running finished!
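To compare the runs above side by side, the per-target average latency can be pulled out of the log with a short script. This is only a sketch: `collect_latencies` is a hypothetical helper, not part of MACE, and the regular expressions assume the exact `Run model ... on ...` and `Average latency: ... ms` line formats shown in the output above.

```python
import re

# Assumed log format, copied from the run output above:
#   "Run model sp on msmnile"
#   "I mace/tools/validation/mace_run.cc:407] Average latency: 60.1606 ms"
RUN_HEADER = re.compile(r"^Run model (\S+) on (\S+)\s*$")
AVG_LATENCY = re.compile(r"Average latency: ([0-9.]+) ms")


def collect_latencies(log_text):
    """Return a list of (model, target, avg_latency_ms), one per run block.

    Hypothetical helper for illustration; latency lines that appear before
    any "Run model ... on ..." header are skipped because their target is
    unknown.
    """
    results = []
    current = None  # (model, target) of the run block currently being read
    for line in log_text.splitlines():
        header = RUN_HEADER.match(line.strip())
        if header:
            current = (header.group(1), header.group(2))
            continue
        latency = AVG_LATENCY.search(line)
        if latency and current is not None:
            results.append((current[0], current[1], float(latency.group(1))))
    return results


if __name__ == "__main__":
    import sys

    # Usage: python collect_latencies.py < run.log
    for model, target, ms in collect_latencies(sys.stdin.read()):
        print(f"{model}\t{target}\t{ms:.3f} ms")
```

Repeated runs on the same target (such as the two qcs605 blocks) are reported as separate entries rather than averaged, so run-to-run variation stays visible.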
System information
Model deploy file (*.yml)
Describe the problem
A segmentation fault occurs when running a quantized depthwise conv2d.
To Reproduce
Steps to reproduce the problem:
Error information / logs
Full log: https://gist.github.com/gasgallo/619eb23800d7caf46e6e97ed23bfc38a
Additional context
The model runs fine without quantization.