OAID / Caffe-HRT

Heterogeneous Run Time version of Caffe. It adds heterogeneous capabilities to Caffe, using a heterogeneous computing infrastructure framework to speed up deep learning on Arm-based heterogeneous embedded platforms. It also retains all the features of the original Caffe architecture, so users can deploy their applications seamlessly.

Forward pass takes more time when ACL is enabled #14

Open haolongzhangm opened 6 years ago

haolongzhangm commented 6 years ago

Issue summary

The forward pass takes more time when ACL is enabled.

Steps to reproduce

1: Build https://github.com/ARM-software/ComputeLibrary with the command: scons Werror=1 -j8 debug=0 asserts=1 neon=1 opencl=1 embed_kernels=1 os=android arch=arm64-v8a

2: Build ACLCAFFE for the Android platform with the environment variable export ACL_DIR=${ANDROID_LIB_ROOT}/ComputeLibrary and the build options -DCPU_ONLY=ON -DUSE_PROFILING=ON -DUSE_ACL=ON

3: Run caffe on an Android MTK/QCOM Arm platform, testing with MNIST.

4: Use the same model and protobuf for every run.

With ACL enabled, the MNIST forward test takes 0m20.48s for 30 iterations.

PS: With the same code and only -DUSE_ACL=OFF changed (that is, CPU only), the MNIST forward test takes just 0m15.26s for 30 iterations.

Why is CPU-only more efficient than NEON+GPU?

PS: The device uses an MTK arm64 chip with a MALI T88 GPU.
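
To put the two reported wall-clock times side by side, here is a minimal Python sketch of the arithmetic. It uses only the totals quoted in this issue (20.48 s and 15.26 s for 30 iterations). The variable and function names are illustrative, and the totals presumably include process startup and network initialization, so the per-iteration figures are upper bounds rather than pure forward-pass times.

```python
# Wall-clock totals quoted in this issue, in seconds, for 30 iterations each.
ITERATIONS = 30
acl_total_s = 20.48  # build with -DUSE_ACL=ON
cpu_total_s = 15.26  # build with -DUSE_ACL=OFF


def per_iteration(total_s, iterations=ITERATIONS):
    """Average wall-clock seconds per iteration (includes startup overhead)."""
    return total_s / iterations


acl_iter_s = per_iteration(acl_total_s)
cpu_iter_s = per_iteration(cpu_total_s)

# Ratio > 1 means the ACL build is slower than the CPU-only build.
slowdown = acl_total_s / cpu_total_s

print(f"ACL on : {acl_iter_s:.3f} s/iteration")
print(f"ACL off: {cpu_iter_s:.3f} s/iteration")
print(f"ACL build is {(slowdown - 1) * 100:.1f}% slower overall")
```

Because both totals include the same setup work, the gap in the forward pass itself may be somewhat larger than this ratio suggests; the per-batch timestamps in the logs give a tighter estimate.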


haolongzhangm commented 6 years ago

mtk_disable_acl.log ======================== WARNING: Logging before InitGoogleLogging() is written to STDERR I0112 18:18:59.234746 4927 caffe.cpp:288] Use CPU. I0112 18:18:59.248070 4927 net.cpp:296] The NetState phase (1) differed from the phase (0) specified by a rule in layer mnist I0112 18:18:59.248433 4927 net.cpp:53] Initializing net from parameters: name: "LeNet" state { phase: TEST level: 0 stage: "" } layer { name: "mnist" type: "Data" top: "data" top: "label" include { phase: TEST } transform_param { scale: 0.00390625 } data_param { source: "/data/caffe_eg/mnist/mnist_test_lmdb" batch_size: 100 backend: LMDB } } layer { name: "conv1" type: "Convolution" bottom: "data" top: "conv1" param { lr_mult: 1 } param { lr_mult: 2 } convolution_param { num_output: 20 kernel_size: 5 stride: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" } } } layer { name: "pool1" type: "Pooling" bottom: "conv1" top: "pool1" pooling_param { pool: MAX kernel_size: 2 stride: 2 } } layer { name: "conv2" type: "Convolution" bottom: "pool1" top: "conv2" param { lr_mult: 1 } param { lr_mult: 2 } convolution_param { num_output: 50 kernel_size: 5 stride: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" } } } layer { name: "pool2" type: "Pooling" bottom: "conv2" top: "pool2" pooling_param { pool: MAX kernel_size: 2 stride: 2 } } layer { name: "ip1" type: "InnerProduct" bottom: "pool2" top: "ip1" param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 500 weight_filler { type: "xavier" } bias_filler { type: "constant" } } } layer { name: "relu1" type: "ReLU" bottom: "ip1" top: "ip1" } layer { name: "ip2" type: "InnerProduct" bottom: "ip1" top: "ip2" param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 10 weight_filler { type: "xavier" } bias_filler { type: "constant" } } } layer { name: "accuracy" type: "Accuracy" bottom: "ip2" bottom: "label" top: "accuracy" include { phase: TEST } } layer { name: "loss" type: 
"SoftmaxWithLoss" bottom: "ip2" bottom: "label" top: "loss" } I0112 18:18:59.250512 4927 layer_factory.hpp:77] Creating layer mnist I0112 18:18:59.251114 4927 db_lmdb.cpp:35] Opened lmdb /data/caffe_eg/mnist/mnist_test_lmdb I0112 18:18:59.251270 4927 net.cpp:86] Creating Layer mnist I0112 18:18:59.251361 4927 net.cpp:382] mnist -> data I0112 18:18:59.251541 4927 net.cpp:382] mnist -> label I0112 18:18:59.251770 4927 data_layer.cpp:45] output data size: 100,1,28,28 I0112 18:18:59.253125 4927 base_data_layer.cpp:72] Initializing prefetch I0112 18:18:59.254364 4927 base_data_layer.cpp:75] Prefetch initialized. I0112 18:18:59.254442 4927 net.cpp:124] Setting up mnist I0112 18:18:59.254505 4927 net.cpp:131] Top shape: 100 1 28 28 (78400) I0112 18:18:59.254591 4927 net.cpp:131] Top shape: 100 (100) I0112 18:18:59.254644 4927 net.cpp:139] Memory required for data: 314000 I0112 18:18:59.254716 4927 layer_factory.hpp:77] Creating layer label_mnist_1_split I0112 18:18:59.254834 4927 net.cpp:86] Creating Layer label_mnist_1_split I0112 18:18:59.254896 4927 net.cpp:408] label_mnist_1_split <- label I0112 18:18:59.254997 4927 net.cpp:382] label_mnist_1_split -> label_mnist_1_split_0 I0112 18:18:59.255122 4927 net.cpp:382] label_mnist_1_split -> label_mnist_1_split_1 I0112 18:18:59.255219 4927 net.cpp:124] Setting up label_mnist_1_split I0112 18:18:59.255280 4927 net.cpp:131] Top shape: 100 (100) I0112 18:18:59.255338 4927 net.cpp:131] Top shape: 100 (100) I0112 18:18:59.255391 4927 net.cpp:139] Memory required for data: 314800 I0112 18:18:59.255445 4927 layer_factory.hpp:77] Creating layer conv1 I0112 18:18:59.255570 4927 net.cpp:86] Creating Layer conv1 I0112 18:18:59.255625 4927 net.cpp:408] conv1 <- data I0112 18:18:59.255730 4927 net.cpp:382] conv1 -> conv1 I0112 18:18:59.256273 4927 net.cpp:124] Setting up conv1 I0112 18:18:59.256342 4927 net.cpp:131] Top shape: 100 20 24 24 (1152000) I0112 18:18:59.256402 4927 net.cpp:139] Memory required for data: 4922800 I0112 
18:18:59.256543 4927 layer_factory.hpp:77] Creating layer pool1 I0112 18:18:59.256656 4927 net.cpp:86] Creating Layer pool1 I0112 18:18:59.256715 4927 net.cpp:408] pool1 <- conv1 I0112 18:18:59.256790 4927 net.cpp:382] pool1 -> pool1 I0112 18:18:59.256947 4927 net.cpp:124] Setting up pool1 I0112 18:18:59.257004 4927 net.cpp:131] Top shape: 100 20 12 12 (288000) I0112 18:18:59.257063 4927 net.cpp:139] Memory required for data: 6074800 I0112 18:18:59.257137 4927 layer_factory.hpp:77] Creating layer conv2 I0112 18:18:59.257249 4927 net.cpp:86] Creating Layer conv2 I0112 18:18:59.257304 4927 net.cpp:408] conv2 <- pool1 I0112 18:18:59.257387 4927 net.cpp:382] conv2 -> conv2 I0112 18:18:59.265692 4928 data_layer.cpp:128] Prefetch batch: 10 ms. I0112 18:18:59.266127 4928 data_layer.cpp:129] Read time: 0.97 ms. I0112 18:18:59.266190 4928 data_layer.cpp:130] Transform time: 8.74 ms. I0112 18:18:59.267087 4927 net.cpp:124] Setting up conv2 I0112 18:18:59.267189 4927 net.cpp:131] Top shape: 100 50 8 8 (320000) I0112 18:18:59.267253 4927 net.cpp:139] Memory required for data: 7354800 I0112 18:18:59.267407 4927 layer_factory.hpp:77] Creating layer pool2 I0112 18:18:59.267520 4927 net.cpp:86] Creating Layer pool2 I0112 18:18:59.267583 4927 net.cpp:408] pool2 <- conv2 I0112 18:18:59.267673 4927 net.cpp:382] pool2 -> pool2 I0112 18:18:59.267827 4927 net.cpp:124] Setting up pool2 I0112 18:18:59.267879 4927 net.cpp:131] Top shape: 100 50 4 4 (80000) I0112 18:18:59.267937 4927 net.cpp:139] Memory required for data: 7674800 I0112 18:18:59.267990 4927 layer_factory.hpp:77] Creating layer ip1 I0112 18:18:59.268086 4927 net.cpp:86] Creating Layer ip1 I0112 18:18:59.268161 4927 net.cpp:408] ip1 <- pool2 I0112 18:18:59.268246 4927 net.cpp:382] ip1 -> ip1 I0112 18:18:59.277564 4928 data_layer.cpp:128] Prefetch batch: 11 ms. I0112 18:18:59.277770 4928 data_layer.cpp:129] Read time: 1.834 ms. I0112 18:18:59.277832 4928 data_layer.cpp:130] Transform time: 8.357 ms. 
I0112 18:18:59.287046 4928 data_layer.cpp:128] Prefetch batch: 9 ms. I0112 18:18:59.287221 4928 data_layer.cpp:129] Read time: 0.851 ms. I0112 18:18:59.287283 4928 data_layer.cpp:130] Transform time: 7.232 ms. I0112 18:18:59.296427 4928 data_layer.cpp:128] Prefetch batch: 9 ms. I0112 18:18:59.296596 4928 data_layer.cpp:129] Read time: 0.844 ms. I0112 18:18:59.296668 4928 data_layer.cpp:130] Transform time: 7.216 ms. I0112 18:18:59.394999 4927 net.cpp:124] Setting up ip1 I0112 18:18:59.395175 4927 net.cpp:131] Top shape: 100 500 (50000) I0112 18:18:59.395232 4927 net.cpp:139] Memory required for data: 7874800 I0112 18:18:59.395395 4927 layer_factory.hpp:77] Creating layer relu1 I0112 18:18:59.395499 4927 net.cpp:86] Creating Layer relu1 I0112 18:18:59.395555 4927 net.cpp:408] relu1 <- ip1 I0112 18:18:59.395621 4927 net.cpp:369] relu1 -> ip1 (in-place) I0112 18:18:59.395695 4927 net.cpp:124] Setting up relu1 I0112 18:18:59.395731 4927 net.cpp:131] Top shape: 100 500 (50000) I0112 18:18:59.395773 4927 net.cpp:139] Memory required for data: 8074800 I0112 18:18:59.395812 4927 layer_factory.hpp:77] Creating layer ip2 I0112 18:18:59.395889 4927 net.cpp:86] Creating Layer ip2 I0112 18:18:59.395927 4927 net.cpp:408] ip2 <- ip1 I0112 18:18:59.395985 4927 net.cpp:382] ip2 -> ip2 I0112 18:18:59.397521 4927 net.cpp:124] Setting up ip2 I0112 18:18:59.397605 4927 net.cpp:131] Top shape: 100 10 (1000) I0112 18:18:59.397651 4927 net.cpp:139] Memory required for data: 8078800 I0112 18:18:59.397737 4927 layer_factory.hpp:77] Creating layer ip2_ip2_0_split I0112 18:18:59.397806 4927 net.cpp:86] Creating Layer ip2_ip2_0_split I0112 18:18:59.397848 4927 net.cpp:408] ip2_ip2_0_split <- ip2 I0112 18:18:59.397905 4927 net.cpp:382] ip2_ip2_0_split -> ip2_ip2_0_split_0 I0112 18:18:59.397965 4927 net.cpp:382] ip2_ip2_0_split -> ip2_ip2_0_split_1 I0112 18:18:59.398030 4927 net.cpp:124] Setting up ip2_ip2_0_split I0112 18:18:59.398069 4927 net.cpp:131] Top shape: 100 10 (1000) I0112 
18:18:59.398114 4927 net.cpp:131] Top shape: 100 10 (1000) I0112 18:18:59.398154 4927 net.cpp:139] Memory required for data: 8086800 I0112 18:18:59.398195 4927 layer_factory.hpp:77] Creating layer accuracy I0112 18:18:59.398280 4927 net.cpp:86] Creating Layer accuracy I0112 18:18:59.398319 4927 net.cpp:408] accuracy <- ip2_ip2_0_split_0 I0112 18:18:59.398378 4927 net.cpp:408] accuracy <- label_mnist_1_split_0 I0112 18:18:59.398430 4927 net.cpp:382] accuracy -> accuracy I0112 18:18:59.398494 4927 net.cpp:124] Setting up accuracy I0112 18:18:59.398531 4927 net.cpp:131] Top shape: (1) I0112 18:18:59.398570 4927 net.cpp:139] Memory required for data: 8086804 I0112 18:18:59.398607 4927 layer_factory.hpp:77] Creating layer loss I0112 18:18:59.398658 4927 net.cpp:86] Creating Layer loss I0112 18:18:59.398697 4927 net.cpp:408] loss <- ip2_ip2_0_split_1 I0112 18:18:59.398746 4927 net.cpp:408] loss <- label_mnist_1_split_1 I0112 18:18:59.398797 4927 net.cpp:382] loss -> loss I0112 18:18:59.398874 4927 layer_factory.hpp:77] Creating layer loss I0112 18:18:59.399009 4927 net.cpp:124] Setting up loss I0112 18:18:59.399049 4927 net.cpp:131] Top shape: (1) I0112 18:18:59.399091 4927 net.cpp:134] with loss weight 1 I0112 18:18:59.399163 4927 net.cpp:139] Memory required for data: 8086808 I0112 18:18:59.399206 4927 net.cpp:200] loss needs backward computation. I0112 18:18:59.399247 4927 net.cpp:202] accuracy does not need backward computation. I0112 18:18:59.399293 4927 net.cpp:200] ip2_ip2_0_split needs backward computation. I0112 18:18:59.399335 4927 net.cpp:200] ip2 needs backward computation. I0112 18:18:59.399373 4927 net.cpp:200] relu1 needs backward computation. I0112 18:18:59.399410 4927 net.cpp:200] ip1 needs backward computation. I0112 18:18:59.399449 4927 net.cpp:200] pool2 needs backward computation. I0112 18:18:59.399488 4927 net.cpp:200] conv2 needs backward computation. I0112 18:18:59.399616 4927 net.cpp:200] pool1 needs backward computation. 
I0112 18:18:59.399658 4927 net.cpp:200] conv1 needs backward computation. I0112 18:18:59.399703 4927 net.cpp:202] label_mnist_1_split does not need backward computation. I0112 18:18:59.399745 4927 net.cpp:202] mnist does not need backward computation. I0112 18:18:59.399791 4927 net.cpp:244] This network produces output accuracy I0112 18:18:59.399834 4927 net.cpp:244] This network produces output loss I0112 18:18:59.399955 4927 net.cpp:257] Network initialization done. I0112 18:18:59.444804 4927 net.cpp:749] Copying source layer mnist I0112 18:18:59.445011 4927 net.cpp:749] Copying source layer conv1 I0112 18:18:59.445123 4927 net.cpp:749] Copying source layer pool1 I0112 18:18:59.445164 4927 net.cpp:749] Copying source layer conv2 I0112 18:18:59.446504 4927 net.cpp:749] Copying source layer pool2 I0112 18:18:59.446604 4927 net.cpp:749] Copying source layer ip1 I0112 18:18:59.465939 4927 net.cpp:749] Copying source layer relu1 I0112 18:18:59.466148 4927 net.cpp:749] Copying source layer ip2 I0112 18:18:59.466502 4927 net.cpp:749] Copying source layer loss I0112 18:18:59.469476 4927 caffe.cpp:294] Running for 30 iterations. I0112 18:18:59.975335 4927 caffe.cpp:317] Batch 0, accuracy = 1 I0112 18:18:59.975615 4927 caffe.cpp:317] Batch 0, loss = 0.0135883 I0112 18:18:59.982455 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:18:59.982668 4928 data_layer.cpp:129] Read time: 0.68 ms. I0112 18:18:59.982718 4928 data_layer.cpp:130] Transform time: 5.255 ms. I0112 18:19:00.474658 4927 caffe.cpp:317] Batch 1, accuracy = 0.99 I0112 18:19:00.474936 4927 caffe.cpp:317] Batch 1, loss = 0.0130389 I0112 18:19:00.481828 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:00.482044 4928 data_layer.cpp:129] Read time: 0.739 ms. I0112 18:19:00.482094 4928 data_layer.cpp:130] Transform time: 5.229 ms. 
I0112 18:19:00.973535 4927 caffe.cpp:317] Batch 2, accuracy = 0.99 I0112 18:19:00.973796 4927 caffe.cpp:317] Batch 2, loss = 0.0159064 I0112 18:19:00.980622 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:00.980806 4928 data_layer.cpp:129] Read time: 0.641 ms. I0112 18:19:00.980851 4928 data_layer.cpp:130] Transform time: 5.224 ms. I0112 18:19:01.472671 4927 caffe.cpp:317] Batch 3, accuracy = 0.99 I0112 18:19:01.472934 4927 caffe.cpp:317] Batch 3, loss = 0.025398 I0112 18:19:01.479765 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:01.479969 4928 data_layer.cpp:129] Read time: 0.642 ms. I0112 18:19:01.480016 4928 data_layer.cpp:130] Transform time: 5.286 ms. I0112 18:19:01.971870 4927 caffe.cpp:317] Batch 4, accuracy = 0.99 I0112 18:19:01.972134 4927 caffe.cpp:317] Batch 4, loss = 0.0665555 I0112 18:19:01.978899 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:01.979137 4928 data_layer.cpp:129] Read time: 0.674 ms. I0112 18:19:01.979188 4928 data_layer.cpp:130] Transform time: 5.194 ms. I0112 18:19:02.471039 4927 caffe.cpp:317] Batch 5, accuracy = 0.99 I0112 18:19:02.471273 4927 caffe.cpp:317] Batch 5, loss = 0.0517147 I0112 18:19:02.478014 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:02.478255 4928 data_layer.cpp:129] Read time: 0.664 ms. I0112 18:19:02.478307 4928 data_layer.cpp:130] Transform time: 5.194 ms. I0112 18:19:02.969833 4927 caffe.cpp:317] Batch 6, accuracy = 0.98 I0112 18:19:02.970077 4927 caffe.cpp:317] Batch 6, loss = 0.0685784 I0112 18:19:02.976835 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:02.977072 4928 data_layer.cpp:129] Read time: 0.638 ms. I0112 18:19:02.977123 4928 data_layer.cpp:130] Transform time: 5.194 ms. I0112 18:19:03.468849 4927 caffe.cpp:317] Batch 7, accuracy = 0.99 I0112 18:19:03.469127 4927 caffe.cpp:317] Batch 7, loss = 0.016397 I0112 18:19:03.475927 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:03.476166 4928 data_layer.cpp:129] Read time: 0.637 ms. 
I0112 18:19:03.476215 4928 data_layer.cpp:130] Transform time: 5.258 ms. I0112 18:19:03.967943 4927 caffe.cpp:317] Batch 8, accuracy = 1 I0112 18:19:03.968226 4927 caffe.cpp:317] Batch 8, loss = 0.010708 I0112 18:19:03.975049 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:03.975281 4928 data_layer.cpp:129] Read time: 0.642 ms. I0112 18:19:03.975332 4928 data_layer.cpp:130] Transform time: 5.239 ms. I0112 18:19:04.467239 4927 caffe.cpp:317] Batch 9, accuracy = 0.99 I0112 18:19:04.467501 4927 caffe.cpp:317] Batch 9, loss = 0.0439754 I0112 18:19:04.474278 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:04.474514 4928 data_layer.cpp:129] Read time: 0.636 ms. I0112 18:19:04.474563 4928 data_layer.cpp:130] Transform time: 5.235 ms. I0112 18:19:04.965932 4927 caffe.cpp:317] Batch 10, accuracy = 0.98 I0112 18:19:04.966205 4927 caffe.cpp:317] Batch 10, loss = 0.0561715 I0112 18:19:04.973026 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:04.973227 4928 data_layer.cpp:129] Read time: 0.641 ms. I0112 18:19:04.973274 4928 data_layer.cpp:130] Transform time: 5.268 ms. I0112 18:19:05.464938 4927 caffe.cpp:317] Batch 11, accuracy = 0.97 I0112 18:19:05.465180 4927 caffe.cpp:317] Batch 11, loss = 0.0499199 I0112 18:19:05.472012 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:05.472225 4928 data_layer.cpp:129] Read time: 0.642 ms. I0112 18:19:05.472275 4928 data_layer.cpp:130] Transform time: 5.277 ms. I0112 18:19:05.963747 4927 caffe.cpp:317] Batch 12, accuracy = 0.96 I0112 18:19:05.963994 4927 caffe.cpp:317] Batch 12, loss = 0.126818 I0112 18:19:05.970829 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:05.971026 4928 data_layer.cpp:129] Read time: 0.644 ms. I0112 18:19:05.971075 4928 data_layer.cpp:130] Transform time: 5.245 ms. 
I0112 18:19:06.462350 4927 caffe.cpp:317] Batch 13, accuracy = 0.98 I0112 18:19:06.462580 4927 caffe.cpp:317] Batch 13, loss = 0.042856 I0112 18:19:06.469331 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:06.469631 4928 data_layer.cpp:129] Read time: 0.63 ms. I0112 18:19:06.469693 4928 data_layer.cpp:130] Transform time: 5.201 ms. I0112 18:19:06.961202 4927 caffe.cpp:317] Batch 14, accuracy = 0.99 I0112 18:19:06.961447 4927 caffe.cpp:317] Batch 14, loss = 0.0221329 I0112 18:19:06.968205 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:06.968438 4928 data_layer.cpp:129] Read time: 0.634 ms. I0112 18:19:06.968489 4928 data_layer.cpp:130] Transform time: 5.235 ms. I0112 18:19:07.459849 4927 caffe.cpp:317] Batch 15, accuracy = 0.97 I0112 18:19:07.460063 4927 caffe.cpp:317] Batch 15, loss = 0.0545126 I0112 18:19:07.466822 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:07.467059 4928 data_layer.cpp:129] Read time: 0.632 ms. I0112 18:19:07.467108 4928 data_layer.cpp:130] Transform time: 5.219 ms. I0112 18:19:07.958456 4927 caffe.cpp:317] Batch 16, accuracy = 0.99 I0112 18:19:07.958734 4927 caffe.cpp:317] Batch 16, loss = 0.0278128 I0112 18:19:07.965502 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:07.965739 4928 data_layer.cpp:129] Read time: 0.642 ms. I0112 18:19:07.965790 4928 data_layer.cpp:130] Transform time: 5.226 ms. I0112 18:19:08.456940 4927 caffe.cpp:317] Batch 17, accuracy = 0.99 I0112 18:19:08.457226 4927 caffe.cpp:317] Batch 17, loss = 0.0294755 I0112 18:19:08.464036 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:08.464251 4928 data_layer.cpp:129] Read time: 0.692 ms. I0112 18:19:08.464300 4928 data_layer.cpp:130] Transform time: 5.191 ms. I0112 18:19:08.955890 4927 caffe.cpp:317] Batch 18, accuracy = 1 I0112 18:19:08.956151 4927 caffe.cpp:317] Batch 18, loss = 0.00674172 I0112 18:19:08.962944 4928 data_layer.cpp:128] Prefetch batch: 6 ms. 
I0112 18:19:08.963161 4928 data_layer.cpp:129] Read time: 0.635 ms. I0112 18:19:08.963212 4928 data_layer.cpp:130] Transform time: 5.268 ms. I0112 18:19:09.454437 4927 caffe.cpp:317] Batch 19, accuracy = 0.99 I0112 18:19:09.454717 4927 caffe.cpp:317] Batch 19, loss = 0.0572211 I0112 18:19:09.461544 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:09.461732 4928 data_layer.cpp:129] Read time: 0.644 ms. I0112 18:19:09.461781 4928 data_layer.cpp:130] Transform time: 5.27 ms. I0112 18:19:09.953292 4927 caffe.cpp:317] Batch 20, accuracy = 0.98 I0112 18:19:09.953536 4927 caffe.cpp:317] Batch 20, loss = 0.0989483 I0112 18:19:09.960389 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:09.960572 4928 data_layer.cpp:129] Read time: 0.641 ms. I0112 18:19:09.960616 4928 data_layer.cpp:130] Transform time: 5.229 ms. I0112 18:19:10.452282 4927 caffe.cpp:317] Batch 21, accuracy = 0.98 I0112 18:19:10.452569 4927 caffe.cpp:317] Batch 21, loss = 0.046975 I0112 18:19:10.459321 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:10.459615 4928 data_layer.cpp:129] Read time: 0.639 ms. I0112 18:19:10.459681 4928 data_layer.cpp:130] Transform time: 5.216 ms. I0112 18:19:10.950994 4927 caffe.cpp:317] Batch 22, accuracy = 0.99 I0112 18:19:10.951226 4927 caffe.cpp:317] Batch 22, loss = 0.039 I0112 18:19:10.957927 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:10.958166 4928 data_layer.cpp:129] Read time: 0.627 ms. I0112 18:19:10.958217 4928 data_layer.cpp:130] Transform time: 5.197 ms. I0112 18:19:11.449833 4927 caffe.cpp:317] Batch 23, accuracy = 0.98 I0112 18:19:11.450093 4927 caffe.cpp:317] Batch 23, loss = 0.0288257 I0112 18:19:11.456837 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:11.457073 4928 data_layer.cpp:129] Read time: 0.636 ms. I0112 18:19:11.457124 4928 data_layer.cpp:130] Transform time: 5.221 ms. 
I0112 18:19:11.948463 4927 caffe.cpp:317] Batch 24, accuracy = 0.98 I0112 18:19:11.948744 4927 caffe.cpp:317] Batch 24, loss = 0.0627499 I0112 18:19:11.955555 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:11.955797 4928 data_layer.cpp:129] Read time: 0.635 ms. I0112 18:19:11.955847 4928 data_layer.cpp:130] Transform time: 5.236 ms. I0112 18:19:12.447453 4927 caffe.cpp:317] Batch 25, accuracy = 0.99 I0112 18:19:12.447733 4927 caffe.cpp:317] Batch 25, loss = 0.0770601 I0112 18:19:12.454560 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:12.454789 4928 data_layer.cpp:129] Read time: 0.634 ms. I0112 18:19:12.454840 4928 data_layer.cpp:130] Transform time: 5.269 ms. I0112 18:19:12.946315 4927 caffe.cpp:317] Batch 26, accuracy = 0.99 I0112 18:19:12.946588 4927 caffe.cpp:317] Batch 26, loss = 0.0888426 I0112 18:19:12.953423 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:12.953652 4928 data_layer.cpp:129] Read time: 0.641 ms. I0112 18:19:12.953701 4928 data_layer.cpp:130] Transform time: 5.254 ms. I0112 18:19:13.445303 4927 caffe.cpp:317] Batch 27, accuracy = 0.99 I0112 18:19:13.445560 4927 caffe.cpp:317] Batch 27, loss = 0.0210676 I0112 18:19:13.452410 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:13.452630 4928 data_layer.cpp:129] Read time: 0.643 ms. I0112 18:19:13.452678 4928 data_layer.cpp:130] Transform time: 5.299 ms. I0112 18:19:13.944241 4927 caffe.cpp:317] Batch 28, accuracy = 0.99 I0112 18:19:13.944494 4927 caffe.cpp:317] Batch 28, loss = 0.0455608 I0112 18:19:13.951339 4928 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:19:13.951529 4928 data_layer.cpp:129] Read time: 0.63 ms. I0112 18:19:13.951577 4928 data_layer.cpp:130] Transform time: 5.203 ms. 
I0112 18:19:14.443364 4927 caffe.cpp:317] Batch 29, accuracy = 0.97 I0112 18:19:14.443640 4927 caffe.cpp:317] Batch 29, loss = 0.151275 I0112 18:19:14.443702 4927 caffe.cpp:322] Loss: 0.0486609 I0112 18:19:14.443778 4927 caffe.cpp:334] accuracy = 0.985667 I0112 18:19:14.443860 4927 caffe.cpp:334] loss = 0.0486609 (* 1 = 0.0486609 loss) 0m15.26s real 0m15.44s user 0m0.05s system
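
The per-batch timing and accuracy can be pulled out of a log like the one above with a short script. The following is a sketch in Python, assuming the glog line layout seen in this thread (`Ihhmm HH:MM:SS.micros pid caffe.cpp:NNN] Batch N, accuracy = X`); `parse_batches` and `summarize` are hypothetical helper names, not part of Caffe. It tolerates entries that were flattened onto one line or split across lines, but it does not handle a run that crosses midnight.

```python
import re

# Matches the "Batch N, accuracy = X" entries emitted by caffe.cpp, wherever
# they appear in the text (the logs in this thread are flattened onto long lines).
BATCH_RE = re.compile(
    r"I\d{4}\s+(\d{2}):(\d{2}):(\d{2}\.\d+)\s+\d+\s+caffe\.cpp:\d+\]\s+"
    r"Batch (\d+), accuracy = ([0-9.]+)"
)


def parse_batches(log_text):
    """Return a list of (batch_index, seconds_since_midnight, accuracy)."""
    rows = []
    for hours, minutes, seconds, batch, accuracy in BATCH_RE.findall(log_text):
        timestamp = int(hours) * 3600 + int(minutes) * 60 + float(seconds)
        rows.append((int(batch), timestamp, float(accuracy)))
    return rows


def summarize(rows):
    """Mean time between consecutive batch reports and mean accuracy."""
    times = [t for _, t, _ in rows]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "batches": len(rows),
        "mean_gap_s": sum(gaps) / len(gaps) if gaps else None,
        "mean_accuracy": sum(a for _, _, a in rows) / len(rows) if rows else None,
    }


# A few entries copied from the CPU-only log above.
sample = (
    "I0112 18:18:59.975335 4927 caffe.cpp:317] Batch 0, accuracy = 1 "
    "I0112 18:18:59.975615 4927 caffe.cpp:317] Batch 0, loss = 0.0135883 "
    "I0112 18:19:00.474658 4927 caffe.cpp:317] Batch 1, accuracy = 0.99 "
    "I0112 18:19:00.973535 4927 caffe.cpp:317] Batch 2, accuracy = 0.99"
)
summary = summarize(parse_batches(sample))
print(summary)
```

Running the same helpers over each full log gives a per-batch time and a mean accuracy for the two builds, which makes it easy to check whether both runs produce comparable results as well as comparable speed.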

haolongzhangm commented 6 years ago

mtk_open_acl.log ====================== WARNING: Logging before InitGoogleLogging() is written to STDERR I0112 18:21:03.084941 4939 caffe.cpp:288] Use CPU. I0112 18:21:03.168365 4939 net.cpp:296] The NetState phase (1) differed from the phase (0) specified by a rule in layer mnist I0112 18:21:03.168673 4939 net.cpp:53] Initializing net from parameters: name: "LeNet" state { phase: TEST level: 0 stage: "" } layer { name: "mnist" type: "Data" top: "data" top: "label" include { phase: TEST } transform_param { scale: 0.00390625 } data_param { source: "/data/caffe_eg/mnist/mnist_test_lmdb" batch_size: 100 backend: LMDB } } layer { name: "conv1" type: "Convolution" bottom: "data" top: "conv1" param { lr_mult: 1 } param { lr_mult: 2 } convolution_param { num_output: 20 kernel_size: 5 stride: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" } } } layer { name: "pool1" type: "Pooling" bottom: "conv1" top: "pool1" pooling_param { pool: MAX kernel_size: 2 stride: 2 } } layer { name: "conv2" type: "Convolution" bottom: "pool1" top: "conv2" param { lr_mult: 1 } param { lr_mult: 2 } convolution_param { num_output: 50 kernel_size: 5 stride: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" } } } layer { name: "pool2" type: "Pooling" bottom: "conv2" top: "pool2" pooling_param { pool: MAX kernel_size: 2 stride: 2 } } layer { name: "ip1" type: "InnerProduct" bottom: "pool2" top: "ip1" param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 500 weight_filler { type: "xavier" } bias_filler { type: "constant" } } } layer { name: "relu1" type: "ReLU" bottom: "ip1" top: "ip1" } layer { name: "ip2" type: "InnerProduct" bottom: "ip1" top: "ip2" param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 10 weight_filler { type: "xavier" } bias_filler { type: "constant" } } } layer { name: "accuracy" type: "Accuracy" bottom: "ip2" bottom: "label" top: "accuracy" include { phase: TEST } } layer { name: "loss" type: 
"SoftmaxWithLoss" bottom: "ip2" bottom: "label" top: "loss" } I0112 18:21:03.172957 4939 layer_factory.hpp:77] Creating layer mnist I0112 18:21:03.173509 4939 db_lmdb.cpp:35] Opened lmdb /data/caffe_eg/mnist/mnist_test_lmdb I0112 18:21:03.173676 4939 net.cpp:86] Creating Layer mnist I0112 18:21:03.173784 4939 net.cpp:382] mnist -> data I0112 18:21:03.173940 4939 net.cpp:382] mnist -> label I0112 18:21:03.174170 4939 data_layer.cpp:45] output data size: 100,1,28,28 I0112 18:21:03.175276 4939 base_data_layer.cpp:72] Initializing prefetch I0112 18:21:03.176249 4939 base_data_layer.cpp:75] Prefetch initialized. I0112 18:21:03.176307 4939 net.cpp:124] Setting up mnist I0112 18:21:03.176363 4939 net.cpp:131] Top shape: 100 1 28 28 (78400) I0112 18:21:03.176442 4939 net.cpp:131] Top shape: 100 (100) I0112 18:21:03.176482 4939 net.cpp:139] Memory required for data: 314000 I0112 18:21:03.176545 4939 layer_factory.hpp:77] Creating layer label_mnist_1_split I0112 18:21:03.176658 4939 net.cpp:86] Creating Layer label_mnist_1_split I0112 18:21:03.176712 4939 net.cpp:408] label_mnist_1_split <- label I0112 18:21:03.176803 4939 net.cpp:382] label_mnist_1_split -> label_mnist_1_split_0 I0112 18:21:03.176892 4939 net.cpp:382] label_mnist_1_split -> label_mnist_1_split_1 I0112 18:21:03.176998 4939 net.cpp:124] Setting up label_mnist_1_split I0112 18:21:03.177047 4939 net.cpp:131] Top shape: 100 (100) I0112 18:21:03.177098 4939 net.cpp:131] Top shape: 100 (100) I0112 18:21:03.177137 4939 net.cpp:139] Memory required for data: 314800 I0112 18:21:03.177180 4939 layer_factory.hpp:77] Creating layer conv1 I0112 18:21:03.177312 4939 net.cpp:86] Creating Layer conv1 I0112 18:21:03.177353 4939 net.cpp:408] conv1 <- data I0112 18:21:03.177441 4939 net.cpp:382] conv1 -> conv1 I0112 18:21:03.177877 4939 net.cpp:124] Setting up conv1 I0112 18:21:03.177945 4939 net.cpp:131] Top shape: 100 20 24 24 (1152000) I0112 18:21:03.177994 4939 net.cpp:139] Memory required for data: 4922800 I0112 
18:21:03.178120 4939 layer_factory.hpp:77] Creating layer pool1 I0112 18:21:03.178225 4939 net.cpp:86] Creating Layer pool1 I0112 18:21:03.178276 4939 net.cpp:408] pool1 <- conv1 I0112 18:21:03.178340 4939 net.cpp:382] pool1 -> pool1 I0112 18:21:03.178478 4939 net.cpp:124] Setting up pool1 I0112 18:21:03.178521 4939 net.cpp:131] Top shape: 100 20 12 12 (288000) I0112 18:21:03.178563 4939 net.cpp:139] Memory required for data: 6074800 I0112 18:21:03.178606 4939 layer_factory.hpp:77] Creating layer conv2 I0112 18:21:03.178714 4939 net.cpp:86] Creating Layer conv2 I0112 18:21:03.178757 4939 net.cpp:408] conv2 <- pool1 I0112 18:21:03.178824 4939 net.cpp:382] conv2 -> conv2 I0112 18:21:03.184686 4955 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:21:03.184856 4955 data_layer.cpp:129] Read time: 0.668 ms. I0112 18:21:03.184902 4955 data_layer.cpp:130] Transform time: 5.281 ms. I0112 18:21:03.186022 4939 net.cpp:124] Setting up conv2 I0112 18:21:03.186111 4939 net.cpp:131] Top shape: 100 50 8 8 (320000) I0112 18:21:03.186163 4939 net.cpp:139] Memory required for data: 7354800 I0112 18:21:03.186282 4939 layer_factory.hpp:77] Creating layer pool2 I0112 18:21:03.186383 4939 net.cpp:86] Creating Layer pool2 I0112 18:21:03.186434 4939 net.cpp:408] pool2 <- conv2 I0112 18:21:03.186506 4939 net.cpp:382] pool2 -> pool2 I0112 18:21:03.186645 4939 net.cpp:124] Setting up pool2 I0112 18:21:03.186691 4939 net.cpp:131] Top shape: 100 50 4 4 (80000) I0112 18:21:03.186730 4939 net.cpp:139] Memory required for data: 7674800 I0112 18:21:03.186770 4939 layer_factory.hpp:77] Creating layer ip1 I0112 18:21:03.186858 4939 net.cpp:86] Creating Layer ip1 I0112 18:21:03.186902 4939 net.cpp:408] ip1 <- pool2 I0112 18:21:03.186993 4939 net.cpp:382] ip1 -> ip1 I0112 18:21:03.191673 4955 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:21:03.191824 4955 data_layer.cpp:129] Read time: 0.646 ms. I0112 18:21:03.191906 4955 data_layer.cpp:130] Transform time: 5.255 ms. 
I0112 18:21:03.198546 4955 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:21:03.198689 4955 data_layer.cpp:129] Read time: 0.625 ms. I0112 18:21:03.198783 4955 data_layer.cpp:130] Transform time: 5.175 ms. I0112 18:21:03.205486 4955 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:21:03.205641 4955 data_layer.cpp:129] Read time: 0.647 ms. I0112 18:21:03.205684 4955 data_layer.cpp:130] Transform time: 5.238 ms. I0112 18:21:03.293665 4939 net.cpp:124] Setting up ip1 I0112 18:21:03.293856 4939 net.cpp:131] Top shape: 100 500 (50000) I0112 18:21:03.293913 4939 net.cpp:139] Memory required for data: 7874800 I0112 18:21:03.294035 4939 layer_factory.hpp:77] Creating layer relu1 I0112 18:21:03.294142 4939 net.cpp:86] Creating Layer relu1 I0112 18:21:03.294204 4939 net.cpp:408] relu1 <- ip1 I0112 18:21:03.294270 4939 net.cpp:369] relu1 -> ip1 (in-place) I0112 18:21:03.294345 4939 net.cpp:124] Setting up relu1 I0112 18:21:03.294383 4939 net.cpp:131] Top shape: 100 500 (50000) I0112 18:21:03.294422 4939 net.cpp:139] Memory required for data: 8074800 I0112 18:21:03.294461 4939 layer_factory.hpp:77] Creating layer ip2 I0112 18:21:03.294545 4939 net.cpp:86] Creating Layer ip2 I0112 18:21:03.294636 4939 net.cpp:408] ip2 <- ip1 I0112 18:21:03.294733 4939 net.cpp:382] ip2 -> ip2 I0112 18:21:03.296279 4939 net.cpp:124] Setting up ip2 I0112 18:21:03.296366 4939 net.cpp:131] Top shape: 100 10 (1000) I0112 18:21:03.296411 4939 net.cpp:139] Memory required for data: 8078800 I0112 18:21:03.296516 4939 layer_factory.hpp:77] Creating layer ip2_ip2_0_split I0112 18:21:03.296587 4939 net.cpp:86] Creating Layer ip2_ip2_0_split I0112 18:21:03.296633 4939 net.cpp:408] ip2_ip2_0_split <- ip2 I0112 18:21:03.296696 4939 net.cpp:382] ip2_ip2_0_split -> ip2_ip2_0_split_0 I0112 18:21:03.296766 4939 net.cpp:382] ip2_ip2_0_split -> ip2_ip2_0_split_1 I0112 18:21:03.296836 4939 net.cpp:124] Setting up ip2_ip2_0_split I0112 18:21:03.296880 4939 net.cpp:131] Top shape: 100 10 (1000) I0112 
18:21:03.296924 4939 net.cpp:131] Top shape: 100 10 (1000) I0112 18:21:03.296967 4939 net.cpp:139] Memory required for data: 8086800 I0112 18:21:03.297004 4939 layer_factory.hpp:77] Creating layer accuracy I0112 18:21:03.297101 4939 net.cpp:86] Creating Layer accuracy I0112 18:21:03.297140 4939 net.cpp:408] accuracy <- ip2_ip2_0_split_0 I0112 18:21:03.297191 4939 net.cpp:408] accuracy <- label_mnist_1_split_0 I0112 18:21:03.297245 4939 net.cpp:382] accuracy -> accuracy I0112 18:21:03.297317 4939 net.cpp:124] Setting up accuracy I0112 18:21:03.297360 4939 net.cpp:131] Top shape: (1) I0112 18:21:03.297404 4939 net.cpp:139] Memory required for data: 8086804 I0112 18:21:03.297446 4939 layer_factory.hpp:77] Creating layer loss I0112 18:21:03.297502 4939 net.cpp:86] Creating Layer loss I0112 18:21:03.297545 4939 net.cpp:408] loss <- ip2_ip2_0_split_1 I0112 18:21:03.297596 4939 net.cpp:408] loss <- label_mnist_1_split_1 I0112 18:21:03.297657 4939 net.cpp:382] loss -> loss I0112 18:21:03.297744 4939 layer_factory.hpp:77] Creating layer loss I0112 18:21:03.297914 4939 net.cpp:124] Setting up loss I0112 18:21:03.297960 4939 net.cpp:131] Top shape: (1) I0112 18:21:03.298003 4939 net.cpp:134] with loss weight 1 I0112 18:21:03.298074 4939 net.cpp:139] Memory required for data: 8086808 I0112 18:21:03.298121 4939 net.cpp:200] loss needs backward computation. I0112 18:21:03.298168 4939 net.cpp:202] accuracy does not need backward computation. I0112 18:21:03.298210 4939 net.cpp:200] ip2_ip2_0_split needs backward computation. I0112 18:21:03.298261 4939 net.cpp:200] ip2 needs backward computation. I0112 18:21:03.298301 4939 net.cpp:200] relu1 needs backward computation. I0112 18:21:03.298341 4939 net.cpp:200] ip1 needs backward computation. I0112 18:21:03.298380 4939 net.cpp:200] pool2 needs backward computation. I0112 18:21:03.298420 4939 net.cpp:200] conv2 needs backward computation. I0112 18:21:03.298464 4939 net.cpp:200] pool1 needs backward computation. 
I0112 18:21:03.298506 4939 net.cpp:200] conv1 needs backward computation. I0112 18:21:03.298550 4939 net.cpp:202] label_mnist_1_split does not need backward computation. I0112 18:21:03.298594 4939 net.cpp:202] mnist does not need backward computation. I0112 18:21:03.298630 4939 net.cpp:244] This network produces output accuracy I0112 18:21:03.298676 4939 net.cpp:244] This network produces output loss I0112 18:21:03.298787 4939 net.cpp:257] Network initialization done. I0112 18:21:03.343601 4939 net.cpp:749] Copying source layer mnist I0112 18:21:03.343785 4939 net.cpp:749] Copying source layer conv1 I0112 18:21:03.343896 4939 net.cpp:749] Copying source layer pool1 I0112 18:21:03.343936 4939 net.cpp:749] Copying source layer conv2 I0112 18:21:03.345243 4939 net.cpp:749] Copying source layer pool2 I0112 18:21:03.345346 4939 net.cpp:749] Copying source layer ip1 I0112 18:21:03.364862 4939 net.cpp:749] Copying source layer relu1 I0112 18:21:03.365064 4939 net.cpp:749] Copying source layer ip2 I0112 18:21:03.365422 4939 net.cpp:749] Copying source layer loss I0112 18:21:03.368325 4939 caffe.cpp:294] Running for 30 iterations. I0112 18:21:03.991498 4939 caffe.cpp:317] Batch 0, accuracy = 0.11 I0112 18:21:03.991700 4939 caffe.cpp:317] Batch 0, loss = 2.30272 I0112 18:21:03.999531 4955 data_layer.cpp:128] Prefetch batch: 7 ms. I0112 18:21:04.000589 4955 data_layer.cpp:129] Read time: 0.742 ms. I0112 18:21:04.000680 4955 data_layer.cpp:130] Transform time: 5.978 ms. I0112 18:21:04.656731 4939 caffe.cpp:317] Batch 1, accuracy = 0.1 I0112 18:21:04.656937 4939 caffe.cpp:317] Batch 1, loss = 2.30298 I0112 18:21:04.683619 4955 data_layer.cpp:128] Prefetch batch: 26 ms. I0112 18:21:04.684185 4955 data_layer.cpp:129] Read time: 1.041 ms. I0112 18:21:04.684267 4955 data_layer.cpp:130] Transform time: 24.223 ms. 
I0112 18:21:05.298616 4939 caffe.cpp:317] Batch 2, accuracy = 0.13 I0112 18:21:05.298820 4939 caffe.cpp:317] Batch 2, loss = 2.30277 I0112 18:21:05.307432 4955 data_layer.cpp:128] Prefetch batch: 8 ms. I0112 18:21:05.307773 4955 data_layer.cpp:129] Read time: 0.819 ms. I0112 18:21:05.307845 4955 data_layer.cpp:130] Transform time: 6.679 ms. I0112 18:21:06.068819 4939 caffe.cpp:317] Batch 3, accuracy = 0.07 I0112 18:21:06.069010 4939 caffe.cpp:317] Batch 3, loss = 2.3027 I0112 18:21:06.076590 4955 data_layer.cpp:128] Prefetch batch: 7 ms. I0112 18:21:06.076782 4955 data_layer.cpp:129] Read time: 0.764 ms. I0112 18:21:06.076832 4955 data_layer.cpp:130] Transform time: 5.796 ms. I0112 18:21:06.706459 4939 caffe.cpp:317] Batch 4, accuracy = 0.13 I0112 18:21:06.706658 4939 caffe.cpp:317] Batch 4, loss = 2.3028 I0112 18:21:06.714949 4955 data_layer.cpp:128] Prefetch batch: 8 ms. I0112 18:21:06.715394 4955 data_layer.cpp:129] Read time: 0.776 ms. I0112 18:21:06.715461 4955 data_layer.cpp:130] Transform time: 6.29 ms. I0112 18:21:07.454897 4939 caffe.cpp:317] Batch 5, accuracy = 0.1 I0112 18:21:07.455111 4939 caffe.cpp:317] Batch 5, loss = 2.30277 I0112 18:21:07.465002 4955 data_layer.cpp:128] Prefetch batch: 9 ms. I0112 18:21:07.465601 4955 data_layer.cpp:129] Read time: 1.033 ms. I0112 18:21:07.465689 4955 data_layer.cpp:130] Transform time: 7.558 ms. I0112 18:21:08.191756 4939 caffe.cpp:317] Batch 6, accuracy = 0.1 I0112 18:21:08.191975 4939 caffe.cpp:317] Batch 6, loss = 2.30298 I0112 18:21:08.207046 4955 data_layer.cpp:128] Prefetch batch: 14 ms. I0112 18:21:08.222504 4955 data_layer.cpp:129] Read time: 1.081 ms. I0112 18:21:08.222916 4955 data_layer.cpp:130] Transform time: 12.428 ms. I0112 18:21:08.888784 4939 caffe.cpp:317] Batch 7, accuracy = 0.06 I0112 18:21:08.888995 4939 caffe.cpp:317] Batch 7, loss = 2.30239 I0112 18:21:08.895900 4955 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:21:08.896087 4955 data_layer.cpp:129] Read time: 0.685 ms. 
I0112 18:21:08.896133 4955 data_layer.cpp:130] Transform time: 5.234 ms. I0112 18:21:09.563263 4939 caffe.cpp:317] Batch 8, accuracy = 0.07 I0112 18:21:09.563482 4939 caffe.cpp:317] Batch 8, loss = 2.30192 I0112 18:21:09.571055 4955 data_layer.cpp:128] Prefetch batch: 7 ms. I0112 18:21:09.571507 4955 data_layer.cpp:129] Read time: 0.698 ms. I0112 18:21:09.571594 4955 data_layer.cpp:130] Transform time: 5.899 ms. I0112 18:21:10.166699 4939 caffe.cpp:317] Batch 9, accuracy = 0.08 I0112 18:21:10.166920 4939 caffe.cpp:317] Batch 9, loss = 2.30239 I0112 18:21:10.195009 4955 data_layer.cpp:128] Prefetch batch: 27 ms. I0112 18:21:10.195531 4955 data_layer.cpp:129] Read time: 0.988 ms. I0112 18:21:10.195607 4955 data_layer.cpp:130] Transform time: 21.669 ms. I0112 18:21:10.862153 4939 caffe.cpp:317] Batch 10, accuracy = 0.1 I0112 18:21:10.862344 4939 caffe.cpp:317] Batch 10, loss = 2.30258 I0112 18:21:10.870364 4955 data_layer.cpp:128] Prefetch batch: 7 ms. I0112 18:21:10.870573 4955 data_layer.cpp:129] Read time: 0.772 ms. I0112 18:21:10.870626 4955 data_layer.cpp:130] Transform time: 6.116 ms. I0112 18:21:11.482462 4939 caffe.cpp:317] Batch 11, accuracy = 0.08 I0112 18:21:11.482668 4939 caffe.cpp:317] Batch 11, loss = 2.30272 I0112 18:21:11.493033 4955 data_layer.cpp:128] Prefetch batch: 10 ms. I0112 18:21:11.493221 4955 data_layer.cpp:129] Read time: 0.69 ms. I0112 18:21:11.493271 4955 data_layer.cpp:130] Transform time: 5.18 ms. I0112 18:21:12.040501 4939 caffe.cpp:317] Batch 12, accuracy = 0.08 I0112 18:21:12.040684 4939 caffe.cpp:317] Batch 12, loss = 2.30192 I0112 18:21:12.047759 4955 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:21:12.047948 4955 data_layer.cpp:129] Read time: 0.687 ms. I0112 18:21:12.047999 4955 data_layer.cpp:130] Transform time: 5.362 ms. I0112 18:21:12.656080 4939 caffe.cpp:317] Batch 13, accuracy = 0.1 I0112 18:21:12.656268 4939 caffe.cpp:317] Batch 13, loss = 2.3027 I0112 18:21:12.665878 4955 data_layer.cpp:128] Prefetch batch: 9 ms. 
I0112 18:21:12.675768 4955 data_layer.cpp:129] Read time: 0.702 ms. I0112 18:21:12.675890 4955 data_layer.cpp:130] Transform time: 7.881 ms. I0112 18:21:13.254046 4939 caffe.cpp:317] Batch 14, accuracy = 0.14 I0112 18:21:13.254243 4939 caffe.cpp:317] Batch 14, loss = 2.30298 I0112 18:21:13.274111 4955 data_layer.cpp:128] Prefetch batch: 19 ms. I0112 18:21:13.274307 4955 data_layer.cpp:129] Read time: 0.752 ms. I0112 18:21:13.274359 4955 data_layer.cpp:130] Transform time: 5.89 ms. I0112 18:21:13.991847 4939 caffe.cpp:317] Batch 15, accuracy = 0.11 I0112 18:21:13.992028 4939 caffe.cpp:317] Batch 15, loss = 2.30272 I0112 18:21:13.998821 4955 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:21:14.001773 4955 data_layer.cpp:129] Read time: 0.656 ms. I0112 18:21:14.013264 4955 data_layer.cpp:130] Transform time: 5.163 ms. I0112 18:21:14.627467 4939 caffe.cpp:317] Batch 16, accuracy = 0.12 I0112 18:21:14.627679 4939 caffe.cpp:317] Batch 16, loss = 2.30277 I0112 18:21:14.634605 4955 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:21:14.634816 4955 data_layer.cpp:129] Read time: 0.763 ms. I0112 18:21:14.634865 4955 data_layer.cpp:130] Transform time: 5.192 ms. I0112 18:21:15.336454 4939 caffe.cpp:317] Batch 17, accuracy = 0.09 I0112 18:21:15.336684 4939 caffe.cpp:317] Batch 17, loss = 2.30263 I0112 18:21:15.364289 4955 data_layer.cpp:128] Prefetch batch: 27 ms. I0112 18:21:15.367653 4955 data_layer.cpp:129] Read time: 1.084 ms. I0112 18:21:15.367790 4955 data_layer.cpp:130] Transform time: 24.891 ms. I0112 18:21:15.972120 4939 caffe.cpp:317] Batch 18, accuracy = 0.1 I0112 18:21:15.972316 4939 caffe.cpp:317] Batch 18, loss = 2.30298 I0112 18:21:15.979261 4955 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:21:15.979476 4955 data_layer.cpp:129] Read time: 0.689 ms. I0112 18:21:15.979526 4955 data_layer.cpp:130] Transform time: 5.273 ms. 
I0112 18:21:16.715196 4939 caffe.cpp:317] Batch 19, accuracy = 0.08 I0112 18:21:16.715435 4939 caffe.cpp:317] Batch 19, loss = 2.30239 I0112 18:21:16.728497 4955 data_layer.cpp:128] Prefetch batch: 12 ms. I0112 18:21:16.729202 4955 data_layer.cpp:129] Read time: 1.376 ms. I0112 18:21:16.729313 4955 data_layer.cpp:130] Transform time: 9.945 ms. I0112 18:21:17.464715 4939 caffe.cpp:317] Batch 20, accuracy = 0.08 I0112 18:21:17.464930 4939 caffe.cpp:317] Batch 20, loss = 2.30298 I0112 18:21:17.478517 4955 data_layer.cpp:128] Prefetch batch: 13 ms. I0112 18:21:17.495019 4955 data_layer.cpp:129] Read time: 0.975 ms. I0112 18:21:17.495159 4955 data_layer.cpp:130] Transform time: 11.185 ms. I0112 18:21:18.143924 4939 caffe.cpp:317] Batch 21, accuracy = 0.06 I0112 18:21:18.144124 4939 caffe.cpp:317] Batch 21, loss = 2.30238 I0112 18:21:18.151165 4955 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:21:18.151368 4955 data_layer.cpp:129] Read time: 0.703 ms. I0112 18:21:18.151417 4955 data_layer.cpp:130] Transform time: 5.361 ms. I0112 18:21:18.796382 4939 caffe.cpp:317] Batch 22, accuracy = 0.12 I0112 18:21:18.796591 4939 caffe.cpp:317] Batch 22, loss = 2.3028 I0112 18:21:18.803570 4955 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:21:18.803819 4955 data_layer.cpp:129] Read time: 0.704 ms. I0112 18:21:18.803884 4955 data_layer.cpp:130] Transform time: 5.277 ms. I0112 18:21:19.395406 4939 caffe.cpp:317] Batch 23, accuracy = 0.1 I0112 18:21:19.395613 4939 caffe.cpp:317] Batch 23, loss = 2.30277 I0112 18:21:19.408810 4955 data_layer.cpp:128] Prefetch batch: 12 ms. I0112 18:21:19.418444 4955 data_layer.cpp:129] Read time: 0.9 ms. I0112 18:21:19.418576 4955 data_layer.cpp:130] Transform time: 10.966 ms. I0112 18:21:20.104066 4939 caffe.cpp:317] Batch 24, accuracy = 0.14 I0112 18:21:20.104279 4939 caffe.cpp:317] Batch 24, loss = 2.30238 I0112 18:21:20.116556 4955 data_layer.cpp:128] Prefetch batch: 12 ms. I0112 18:21:20.130146 4955 data_layer.cpp:129] Read time: 0.898 ms. 
I0112 18:21:20.130306 4955 data_layer.cpp:130] Transform time: 10.063 ms. I0112 18:21:20.777202 4939 caffe.cpp:317] Batch 25, accuracy = 0.09 I0112 18:21:20.777466 4939 caffe.cpp:317] Batch 25, loss = 2.3028 I0112 18:21:20.788323 4955 data_layer.cpp:128] Prefetch batch: 10 ms. I0112 18:21:20.788925 4955 data_layer.cpp:129] Read time: 0.949 ms. I0112 18:21:20.789021 4955 data_layer.cpp:130] Transform time: 8.491 ms. I0112 18:21:21.450106 4939 caffe.cpp:317] Batch 26, accuracy = 0.1 I0112 18:21:21.450218 4939 caffe.cpp:317] Batch 26, loss = 2.30192 I0112 18:21:21.457077 4955 data_layer.cpp:128] Prefetch batch: 6 ms. I0112 18:21:21.457684 4955 data_layer.cpp:129] Read time: 0.71 ms. I0112 18:21:21.457792 4955 data_layer.cpp:130] Transform time: 5.218 ms. I0112 18:21:22.026882 4939 caffe.cpp:317] Batch 27, accuracy = 0.09 I0112 18:21:22.027078 4939 caffe.cpp:317] Batch 27, loss = 2.30272 I0112 18:21:22.038611 4955 data_layer.cpp:128] Prefetch batch: 11 ms. I0112 18:21:22.049075 4955 data_layer.cpp:129] Read time: 1.194 ms. I0112 18:21:22.049204 4955 data_layer.cpp:130] Transform time: 9.116 ms. I0112 18:21:22.654046 4939 caffe.cpp:317] Batch 28, accuracy = 0.11 I0112 18:21:22.654257 4939 caffe.cpp:317] Batch 28, loss = 2.30192 I0112 18:21:22.673635 4955 data_layer.cpp:128] Prefetch batch: 19 ms. I0112 18:21:22.678758 4955 data_layer.cpp:129] Read time: 0.882 ms. I0112 18:21:22.678890 4955 data_layer.cpp:130] Transform time: 17.162 ms. I0112 18:21:23.337829 4939 caffe.cpp:317] Batch 29, accuracy = 0.14 I0112 18:21:23.338044 4939 caffe.cpp:317] Batch 29, loss = 2.3027 I0112 18:21:23.338109 4939 caffe.cpp:322] Loss: 2.30261 I0112 18:21:23.338199 4939 caffe.cpp:334] accuracy = 0.0993333 I0112 18:21:23.338338 4939 caffe.cpp:334] loss = 2.30261 (* 1 = 2.30261 loss) 0m20.48s real 0m20.73s user 0m12.44s system

daeinki commented 6 years ago

-DCPU_ONLY=ON: it seems you ran CaffeOnACL on the CPU only, even though you compiled ACL with OpenCL support. Please comment out that line if you want to use the GPU.

P.S. You can check whether the GPU was actually used by running the command below before and after the test app:

cat /proc/interrupts | grep mali

Then check whether the interrupt count has increased. If OpenCL was used correctly, the count must have increased.
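The before/after check can also be scripted. Below is a minimal Python sketch, assuming the Mali driver registers its interrupts under a name containing "mali" (the exact names differ between drivers and boards). The helpers `mali_interrupt_count` and `read_interrupts` are hypothetical names made up for this illustration, not part of Caffe-HRT or ACL.

```python
def mali_interrupt_count(interrupts_text, keyword="mali"):
    """Sum the per-CPU interrupt counters on every line that mentions keyword.

    interrupts_text is the raw content of /proc/interrupts. The keyword
    "mali" is an assumption about how the GPU driver names its IRQs.
    """
    total = 0
    for line in interrupts_text.splitlines():
        if keyword not in line.lower():
            continue
        fields = line.split()
        # fields[0] is the IRQ label (e.g. "42:"); the per-CPU counters
        # follow it and run until the first non-numeric field.
        for field in fields[1:]:
            if not field.isdigit():
                break
            total += int(field)
    return total


def read_interrupts(path="/proc/interrupts"):
    """Return the current content of the kernel's interrupt table."""
    with open(path) as f:
        return f.read()
```

To use it, call `mali_interrupt_count(read_interrupts())` once before and once after the test run. If the two numbers are equal, the GPU most likely did no work during the run.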

For reference, in my tests the GPU was slower than the CPU (OpenBLAS) for some computations and faster for others. I guess you should exclude the first measurement, because the CL kernels are compiled at runtime on the first run, which incurs some overhead.
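One way to keep the one-time kernel compilation out of the numbers is to run a few untimed warm-up passes first. This is only a sketch: `time_forward` is a hypothetical helper, and `forward` stands for whatever callable runs one forward pass in your setup (for example a pycaffe `net.forward`, if you have Python bindings available).

```python
import time


def time_forward(forward, warmup=1, iterations=30):
    """Time `iterations` calls of `forward`, discarding `warmup` initial calls.

    The warm-up calls absorb one-time costs such as runtime OpenCL kernel
    compilation, so they do not distort the per-iteration timings.
    """
    for _ in range(warmup):
        forward()  # untimed: pays the one-time setup cost

    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        forward()
        timings.append(time.perf_counter() - start)
    return timings
```

Comparing `sum(timings) / len(timings)` between the ACL and non-ACL builds gives a per-iteration figure that is not skewed by the first run.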

However, I guess that if CaffeOnACL supported NNPACK (so far only Caffe2 supports it), the CPU would be faster than the GPU in almost all cases. Of course, this would depend on the GPU's power.

I think the benefit of CaffeOnACL is that it allows combined paths (OpenBLAS + ACL GPU, or OpenBLAS + ACL NEON, by bypassing ACL or not) for forward computations such as convolution, activation functions, and pooling, depending on your hardware's performance.

Thanks, Inki Dae

baynaa7 commented 6 years ago

Hello, I am testing caffeACL vs. caffe on a TX2 board. However, the classification example on AlexNet gives the following results. The arguments are exactly the same as those given by kaishijeng in https://github.com/OAID/Caffe-HRT/issues/2.

caffeACL: elapsed time: [2.28925] seconds
caffe: elapsed time: [1.2105] seconds

Note: both are running the CPU version.

Any possible hypotheses for these results? Thanks in advance.