intel / caffe

This fork of BVLC/Caffe is dedicated to improving performance of this deep learning framework when running on CPU, in particular Intel® Xeon processors.

multinode training in docker on multiple machines stuck #199

Closed xzhangxa closed 6 years ago

xzhangxa commented 6 years ago

Hi, I tried to run multinode training on multiple machines in Docker, following this wiki. On the master container, the run_intelcaffe.sh script gets stuck after the initial output. On each of the other machines, the client container has a single caffe process at 100% CPU usage. Training hangs there: no error is reported and the master prints no further output. With only 1 client, training works fine.

Following the wiki, I used the latest bvlc/caffe:intel_multinode image, which is Intel Caffe 1.1.0. bvlc/caffe:intel, which is 1.1.1a, doesn't work either. Passwordless SSH access is set up and there is no firewall.

Below is all the output on the master container, with a hosts file containing 2 clients. After this the script hangs with no error and no further output, while every client has a caffe process running at 100% CPU.

Could you help point out what may be wrong, or how I could debug this? Thanks in advance.
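One generic way to start debugging a hang like this is to find the spinning process on a client and dump its thread stacks. The sketch below is not taken from the Intel Caffe wiki. It assumes `ps` (procps) and `gdb` are available inside the container, and the helper names `busy_pids` and `dump_stacks` are made up for illustration.

```shell
#!/bin/sh
# Read "PID %CPU" lines on stdin and print the PIDs whose %CPU is at or
# above the threshold given as $1 (default 90).
# Note: the pcpu column of ps is averaged over the process lifetime.
busy_pids() {
    awk -v min="${1:-90}" '$2 + 0 >= min + 0 { print $1 }'
}

# Dump the stack of every thread of one process.
# Requires gdb and permission to ptrace the target process.
dump_stacks() {
    gdb -p "$1" -batch -ex "thread apply all bt"
}

# Example, run inside a client container:
#   ps -C caffe -o pid=,pcpu= | busy_pids 90 | while read -r pid; do
#       dump_stacks "$pid"
#   done
```

The stack traces would show whether the busy thread is waiting inside an MPI or MLSL call, which narrows down where to look next.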

root@jfz1r04h16:/opt/caffe# ./scripts/run_intelcaffe.sh --hostfile hosts --solver examples/mnist/lenet_solver.prototxt --network tcp --netmask enp134s0f0

CPUs with optimal settings:
    Intel Xeon E7-88/48xx, E5-46/26/16xx, E3-12xx, D15/D-15 (Broadwell)
    Intel Xeon Phi 7210/30/50/90 (Knights Landing)
    Intel Xeon Platinum 81/61/51/41/31xx (Skylake)

Settings:
    CPU: skx
    Host file: hosts
    Running mode: train
    Benchmark: none
    Debug option: off
    Engine:
    Number of MLSL servers: -1
        -1: selected automatically according to CPU model.
            BDW/SKX: 2, KNL: 4
    Solver file: examples/mnist/lenet_solver.prototxt
    LMDB data source: examples/mnist/mnist_train_lmdb
    LMDB data source: examples/mnist/mnist_test_lmdb
    Network: tcp
    Netmask for TCP network: enp134s0f0
    NUMA configuration: Flat mode.
Create result directory: /opt/caffe/result-20180425053457
    Number of nodes: 2
MLSL_NUM_SERVERS: 2
MLSL_SERVER_AFFINITY: 6,7
Pin internal threads to: 70,71
Reserve number of cores: 30
Number of OpenMP threads: 4
Run caffe with 2 nodes...
Warning: cannot find sensors
[0] [0] MPI startup(): Intel(R) MPI Library, Version 2018 Update 1  Build 20171011 (id: 17941)
[0] [0] MPI startup(): Copyright (C) 2003-2017 Intel Corporation.  All rights reserved.
[0] [0] MPI startup(): Multi-threaded optimized library
[0] [0] ckpt_restart(): The real interface being used for tcp is enp134s0f0 and interface hostname is jfz1r04h19
[0] [0] MPI startup(): tcp data transfer mode
[1] [1] ckpt_restart(): The real interface being used for tcp is enp134s0f0 and interface hostname is jfz1r04h18
[1] [1] MPI startup(): tcp data transfer mode
[0] [0] MPI startup(): Device_reset_idx=5
[0] [0] MPI startup(): Allgather: 4: 27306-38912 & 0-2
[0] [0] MPI startup(): Allgather: 4: 78064-294912 & 0-2
[0] [0] MPI startup(): Allgather: 3: 0-27306 & 0-2
[0] [0] MPI startup(): Allgather: 3: 38912-78064 & 0-2
[0] [0] MPI startup(): Allgather: 3: 0-2147483647 & 0-2
[0] [0] MPI startup(): Allgather: 1: 0-7 & 3-4
[0] [0] MPI startup(): Allgather: 1: 9-4607 & 3-4
[0] [0] MPI startup(): Allgather: 1: 66622-461338 & 3-4
[0] [0] MPI startup(): Allgather: 3: 9081-26350 & 3-4
[0] [0] MPI startup(): Allgather: 3: 461338-2692119 & 3-4
[0] [0] MPI startup(): Allgather: 4: 7-9 & 3-4
[0] [0] MPI startup(): Allgather: 4: 4607-9081 & 3-4
[0] [0] MPI startup(): Allgather: 4: 26350-66622 & 3-4
[0] [0] MPI startup(): Allgather: 4: 0-2147483647 & 3-4
[0] [0] MPI startup(): Allgather: 2: 1-1 & 5-2147483647
[0] [0] MPI startup(): Allgather: 4: 2-3 & 5-2147483647
[0] [0] MPI startup(): Allgather: 1: 4-5 & 5-2147483647
[0] [0] MPI startup(): Allgather: 4: 6-26 & 5-2147483647
[0] [0] MPI startup(): Allgather: 1: 27-98 & 5-2147483647
[0] [0] MPI startup(): Allgather: 3: 99-1029 & 5-2147483647
[0] [0] MPI startup(): Allgather: 4: 1030-5572 & 5-2147483647
[0] [0] MPI startup(): Allgather: 1: 5573-15186 & 5-2147483647
[0] [0] MPI startup(): Allgather: 2: 15187-33976 & 5-2147483647
[0] [0] MPI startup(): Allgather: 1: 33977-74391 & 5-2147483647
[0] [0] MPI startup(): Allgather: 3: 74392-131842 & 5-2147483647
[0] [0] MPI startup(): Allgather: 4: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 3: 0-2147483647 & 0-2
[0] [0] MPI startup(): Allgatherv: 1: 0-2 & 3-4
[0] [0] MPI startup(): Allgatherv: 2: 2-7 & 3-4
[0] [0] MPI startup(): Allgatherv: 1: 7-49 & 3-4
[0] [0] MPI startup(): Allgatherv: 2: 49-113 & 3-4
[0] [0] MPI startup(): Allgatherv: 4: 113-149 & 3-4
[0] [0] MPI startup(): Allgatherv: 3: 149-915 & 3-4
[0] [0] MPI startup(): Allgatherv: 1: 915-1614 & 3-4
[0] [0] MPI startup(): Allgatherv: 4: 1614-3296 & 3-4
[0] [0] MPI startup(): Allgatherv: 2: 3296-5670 & 3-4
[0] [0] MPI startup(): Allgatherv: 1: 5670-10998 & 3-4
[0] [0] MPI startup(): Allgatherv: 4: 10998-185966 & 3-4
[0] [0] MPI startup(): Allgatherv: 3: 185966-381166 & 3-4
[0] [0] MPI startup(): Allgatherv: 4: 381166-1597083 & 3-4
[0] [0] MPI startup(): Allgatherv: 3: 1597083-2998114 & 3-4
[0] [0] MPI startup(): Allgatherv: 4: 0-2147483647 & 3-4
[0] [0] MPI startup(): Allgatherv: 2: 0-47 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 1: 47-103 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 3: 103-438 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 2: 438-757 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 4: 757-1453 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 2: 1453-3133 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 4: 3133-6762 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 2: 6762-10802 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 4: 10802-49917 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 3: 49917-309996 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 4: 309996-3739157 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 3: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Allreduce: 1: 804-1535 & 0-2
[0] [0] MPI startup(): Allreduce: 1: 2061-17116 & 0-2
[0] [0] MPI startup(): Allreduce: 2: 17116-37171 & 0-2
[0] [0] MPI startup(): Allreduce: 2: 344562-1048576 & 0-2
[0] [0] MPI startup(): Allreduce: 3: 37171-344562 & 0-2
[0] [0] MPI startup(): Allreduce: 7: 0-804 & 0-2
[0] [0] MPI startup(): Allreduce: 7: 1535-2061 & 0-2
[0] [0] MPI startup(): Allreduce: 7: 1048576-3026207 & 0-2
[0] [0] MPI startup(): Allreduce: 4: 3026207-8388608 & 0-2
[0] [0] MPI startup(): Allreduce: 7: 8388609-8635416 & 0-2
[0] [0] MPI startup(): Allreduce: 2: 0-2147483647 & 0-2
[0] [0] MPI startup(): Allreduce: 7: 0-6 & 3-4
[0] [0] MPI startup(): Allreduce: 4: 6-11 & 3-4
[0] [0] MPI startup(): Allreduce: 7: 11-49 & 3-4
[0] [0] MPI startup(): Allreduce: 6: 49-321 & 3-4
[0] [0] MPI startup(): Allreduce: 2: 321-720 & 3-4
[0] [0] MPI startup(): Allreduce: 4: 720-1375 & 3-4
[0] [0] MPI startup(): Allreduce: 1: 1375-173904 & 3-4
[0] [0] MPI startup(): Allreduce: 2: 173904-318383 & 3-4
[0] [0] MPI startup(): Allreduce: 7: 318383-1512039 & 3-4
[0] [0] MPI startup(): Allreduce: 6: 1512039-2561761 & 3-4
[0] [0] MPI startup(): Allreduce: 4: 2561762-8388608 & 3-4
[0] [0] MPI startup(): Allreduce: 7: 8388609-10618873 & 3-4
[0] [0] MPI startup(): Allreduce: 8: 0-2147483647 & 3-4
[0] [0] MPI startup(): Allreduce: 1: 0-11 & 5-8
[0] [0] MPI startup(): Allreduce: 4: 11-24 & 5-8
[0] [0] MPI startup(): Allreduce: 6: 24-42 & 5-8
[0] [0] MPI startup(): Allreduce: 1: 42-107 & 5-8
[0] [0] MPI startup(): Allreduce: 4: 107-178 & 5-8
[0] [0] MPI startup(): Allreduce: 1: 178-310 & 5-8
[0] [0] MPI startup(): Allreduce: 2: 310-594 & 5-8
[0] [0] MPI startup(): Allreduce: 5: 594-4431 & 5-8
[0] [0] MPI startup(): Allreduce: 1: 4431-54874 & 5-8
[0] [0] MPI startup(): Allreduce: 4: 54874-91696 & 5-8
[0] [0] MPI startup(): Allreduce: 6: 91696-175538 & 5-8
[0] [0] MPI startup(): Allreduce: 4: 175538-383770 & 5-8
[0] [0] MPI startup(): Allreduce: 2: 383770-684262 & 5-8
[0] [0] MPI startup(): Allreduce: 3: 0-2147483647 & 5-8
[0] [0] MPI startup(): Allreduce: 1: 0-11 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 4: 11-24 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 6: 24-42 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 1: 42-107 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 4: 107-178 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 1: 178-310 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 2: 310-594 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 5: 594-4431 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 1: 4431-54874 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 4: 54874-91696 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 6: 91696-175538 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 4: 175538-383770 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 2: 383770-32006608 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 3: 0-2147483647 & 9-2147483647
[0] [0] MPI startup(): Alltoall: 3: 0-129493 & 0-2
[0] [0] MPI startup(): Alltoall: 3: 1080889-3453431 & 0-2
[0] [0] MPI startup(): Alltoall: 2: 129493-1080889 & 0-2
[0] [0] MPI startup(): Alltoall: 2: 0-2147483647 & 0-2
[0] [0] MPI startup(): Alltoall: 2: 0-2147483647 & 3-4
[0] [0] MPI startup(): Alltoall: 1: 1-64 & 5-2147483647
[0] [0] MPI startup(): Alltoall: 2: 65-572235 & 5-2147483647
[0] [0] MPI startup(): Alltoall: 4: 572236-1736997 & 5-2147483647
[0] [0] MPI startup(): Alltoall: 3: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Alltoallv: 1: 0-2147483647 & 0-2
[0] [0] MPI startup(): Alltoallv: 2: 0-2147483647 & 3-4
[0] [0] MPI startup(): Alltoallv: 2: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Alltoallw: 0: 0-2147483647 & 0-2147483647
[0] [0] MPI startup(): Barrier: 1: 0-2147483647 & 0-2
[0] [0] MPI startup(): Barrier: 6: 0-2147483647 & 3-4
[0] [0] MPI startup(): Barrier: 1: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Bcast: 7: 0-8 & 0-2
[0] [0] MPI startup(): Bcast: 7: 24-64 & 0-2
[0] [0] MPI startup(): Bcast: 7: 11264-52186 & 0-2
[0] [0] MPI startup(): Bcast: 7: 112045-131072 & 0-2
[0] [0] MPI startup(): Bcast: 7: 1048576-2097152 & 0-2
[0] [0] MPI startup(): Bcast: 1: 8-24 & 0-2
[0] [0] MPI startup(): Bcast: 1: 64-11264 & 0-2
[0] [0] MPI startup(): Bcast: 1: 52186-112045 & 0-2
[0] [0] MPI startup(): Bcast: 1: 131072-1048576 & 0-2
[0] [0] MPI startup(): Bcast: 1: 0-2147483647 & 0-2
[0] [0] MPI startup(): Bcast: 1: 1-1 & 3-4
[0] [0] MPI startup(): Bcast: 5: 2-3 & 3-4
[0] [0] MPI startup(): Bcast: 1: 4-5 & 3-4
[0] [0] MPI startup(): Bcast: 6: 6-11 & 3-4
[0] [0] MPI startup(): Bcast: 5: 12-24 & 3-4
[0] [0] MPI startup(): Bcast: 4: 25-141 & 3-4
[0] [0] MPI startup(): Bcast: 7: 142-370 & 3-4
[0] [0] MPI startup(): Bcast: 3: 371-680 & 3-4
[0] [0] MPI startup(): Bcast: 4: 681-3894 & 3-4
[0] [0] MPI startup(): Bcast: 1: 3895-4494 & 3-4
[0] [0] MPI startup(): Bcast: 7: 4495-14778 & 3-4
[0] [0] MPI startup(): Bcast: 4: 14779-18223 & 3-4
[0] [0] MPI startup(): Bcast: 7: 18224-36738 & 3-4
[0] [0] MPI startup(): Bcast: 3: 0-2147483647 & 3-4
[0] [0] MPI startup(): Bcast: 1: 0-10 & 5-2147483647
[0] [0] MPI startup(): Bcast: 1: 175-16799 & 5-2147483647
[0] [0] MPI startup(): Bcast: 6: 10-32 & 5-2147483647
[0] [0] MPI startup(): Bcast: 6: 32-175 & 5-2147483647
[0] [0] MPI startup(): Bcast: 7: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Exscan: 0: 0-2147483647 & 0-2147483647
[0] [0] MPI startup(): Gather: 2: 73643-172031 & 0-2
[0] [0] MPI startup(): Gather: 3: 0-853 & 0-2
[0] [0] MPI startup(): Gather: 3: 54613-73643 & 0-2
[0] [0] MPI startup(): Gather: 3: 262144-524288 & 0-2
[0] [0] MPI startup(): Gather: 1: 853-54613 & 0-2
[0] [0] MPI startup(): Gather: 1: 172031-262144 & 0-2
[0] [0] MPI startup(): Gather: 1: 0-2147483647 & 0-2
[0] [0] MPI startup(): Gather: 2: 34148-129691 & 3-2147483647
[0] [0] MPI startup(): Gather: 2: 503316-2506634 & 3-2147483647
[0] [0] MPI startup(): Gather: 3: 0-34148 & 3-2147483647
[0] [0] MPI startup(): Gather: 3: 129691-503316 & 3-2147483647
[0] [0] MPI startup(): Gather: 3: 0-2147483647 & 3-2147483647
[0] [0] MPI startup(): Gatherv: 1: 0-2147483647 & 0-2
[0] [0] MPI startup(): Gatherv: 1: 0-2147483647 & 3-4
[0] [0] MPI startup(): Gatherv: 1: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 4: 0-5 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 1: 5-26 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 3: 26-47 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 5: 47-98 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 3: 98-188 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 5: 188-362 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 2: 362-588 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 1: 588-1951 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 3: 1951-11702 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 1: 11702-23138 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 5: 23138-58229 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 1: 58229-191964 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 2: 191964-2656092 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 5: 0-2147483647 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 4: 0-4 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 5: 4-12 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 3: 12-45 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 1: 45-85 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 3: 85-391 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 1: 391-596 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 2: 596-1927 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 5: 1927-2286 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 3: 2286-7442 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 1: 7442-10726 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 3: 10726-45950 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 5: 45950-101084 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 1: 101084-159597 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 3: 159597-423110 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 2: 423110-578734 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 5: 578734-1329975 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 1: 1329975-4146461 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 3: 0-2147483647 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 5: 0-5 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 1: 5-28 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 5: 28-50 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 3: 50-197 & 5-2147483647
[1] [1] MPI startup(): Recognition=2 Platform(code=512 ippn=0 dev=4) Fabric(intra=6 inter=6 flags=0x0)
[0] [0] MPI startup(): Reduce_scatter: 1: 197-721 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 2: 721-3207 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 1: 3207-5980 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 5: 5980-11416 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 3: 11416-104215 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 5: 104215-277330 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 3: 277330-630522 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 1: 630522-2659184 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 5: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Reduce: 4: 4-8 & 0-2
[0] [0] MPI startup(): Reduce: 3: 9-29 & 0-2
[0] [0] MPI startup(): Reduce: 2: 30-37 & 0-2
[0] [0] MPI startup(): Reduce: 3: 38-215 & 0-2
[0] [0] MPI startup(): Reduce: 2: 216-315 & 0-2
[0] [0] MPI startup(): Reduce: 5: 316-775 & 0-2
[0] [0] MPI startup(): Reduce: 2: 776-4045 & 0-2
[0] [0] MPI startup(): Reduce: 4: 4-6 & 3-4
[0] [0] MPI startup(): Reduce: 3: 7-11 & 3-4
[0] [0] MPI startup(): Reduce: 6: 12-16 & 3-4
[0] [0] MPI startup(): Reduce: 4: 17-34 & 3-4
[0] [0] MPI startup(): Reduce: 2: 35-99 & 3-4
[0] [0] MPI startup(): Reduce: 4: 100-230 & 3-4
[0] [0] MPI startup(): Reduce: 6: 231-275 & 3-4
[0] [0] MPI startup(): Reduce: 1: 276-1040 & 3-4
[0] [0] MPI startup(): Reduce: 3: 1041-3895 & 3-4
[0] [0] MPI startup(): Reduce: 6: 3896-4326 & 3-4
[0] [0] MPI startup(): Reduce: 3: 4327-10163 & 3-4
[0] [0] MPI startup(): Reduce: 1: 0-2147483647 & 3-4
[0] [0] MPI startup(): Reduce: 2: 4-26 & 5-2147483647
[0] [0] MPI startup(): Reduce: 4: 27-39 & 5-2147483647
[0] [0] MPI startup(): Reduce: 2: 40-230 & 5-2147483647
[0] [0] MPI startup(): Reduce: 3: 231-257 & 5-2147483647
[0] [0] MPI startup(): Reduce: 2: 258-718 & 5-2147483647
[0] [0] MPI startup(): Reduce: 3: 719-2436 & 5-2147483647
[0] [0] MPI startup(): Reduce: 4: 2437-6344 & 5-2147483647
[0] [0] MPI startup(): Reduce: 1: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Scan: 0: 0-2147483647 & 0-2147483647
[0] [0] MPI startup(): Scatter: 1: 0-1 & 0-2
[0] [0] MPI startup(): Scatter: 1: 4-12 & 0-2
[0] [0] MPI startup(): Scatter: 1: 19-2048 & 0-2
[0] [0] MPI startup(): Scatter: 3: 2048-85701 & 0-2
[0] [0] MPI startup(): Scatter: 3: 165767-466939 & 0-2
[0] [0] MPI startup(): Scatter: 3: 524288-2336552 & 0-2
[0] [0] MPI startup(): Scatter: 2: 1-4 & 0-2
[0] [0] MPI startup(): Scatter: 2: 12-19 & 0-2
[0] [0] MPI startup(): Scatter: 2: 85701-165767 & 0-2
[0] [0] MPI startup(): Scatter: 2: 466939-524288 & 0-2
[0] [0] MPI startup(): Scatter: 2: 0-2147483647 & 0-2
[0] [0] MPI startup(): Scatter: 3: 0-1909200 & 3-2147483647
[0] [0] MPI startup(): Scatter: 2: 0-2147483647 & 3-2147483647
[0] [0] MPI startup(): Scatterv: 1: 0-2147483647 & 0-2
[0] [0] MPI startup(): Scatterv: 1: 0-2147483647 & 3-4
[0] [0] MPI startup(): Scatterv: 1: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Rank    Pid      Node name   Pin cpu
[0] [0] MPI startup(): 0       380      jfz1r04h19  {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,
[0]                                   30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56
[0]                                   ,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71}
[0] [0] MPI startup(): 1       773      jfz1r04h18  {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,
[0]                                   30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56
[0]                                   ,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71}
[0] [0] MPI startup(): Recognition=2 Platform(code=512 ippn=0 dev=4) Fabric(intra=6 inter=6 flags=0x0)
[0] [0] MPI startup(): I_MPI_COLL_INTRANODE=pt2pt
[0] [0] MPI startup(): I_MPI_DEBUG=6
[0] [0] MPI startup(): I_MPI_FABRICS=tcp
[0] [0] MPI startup(): I_MPI_FALLBACK=0
[0] [0] MPI startup(): I_MPI_INFO_NUMA_NODE_MAP=hfi1_0:0,i40iw0:0,i40iw1:0
[0] [0] MPI startup(): I_MPI_INFO_NUMA_NODE_NUM=2
[0] [0] MPI startup(): I_MPI_PIN_MAPPING=1:0 0
[0] [0] MPI startup(): I_MPI_TCP_NETMASK=enp134s0f0
[0] [0] ckpt_restart(): The real interface being used for tcp is enp134s0f0 and interface hostname is jfz1r04h19
chuanqi129 commented 6 years ago

Hello, could you please try ./scripts/run_intelcaffe.sh --hostfile hosts --solver examples/mnist/lenet_solver_mlsl.prototxt --network tcp --netmask enp134s0f0? Also, what's your CPU model? Could you please post the output of lscpu too?

xzhangxa commented 6 years ago

@chuanqi129 The result is the same with lenet_solver_mlsl.prototxt. The CPU model is Intel Xeon Gold 6140; the full lscpu output is:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                72
On-line CPU(s) list:   0-71
Thread(s) per core:    2
Core(s) per socket:    18
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 85
Model name:            Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
Stepping:              4
CPU MHz:               1499.941
CPU max MHz:           3700.0000
CPU min MHz:           1000.0000
BogoMIPS:              4600.00
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              1024K
L3 cache:              25344K
NUMA node0 CPU(s):     0-17,36-53
NUMA node1 CPU(s):     18-35,54-71
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req
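
For reference, the physical core count follows from the lscpu fields above as sockets times cores per socket. The helper below is a small sketch that reads lscpu output on stdin; the name `physical_cores` is hypothetical. For the output above it gives 2 times 18, that is 36 physical cores, with the other 36 logical CPUs being hyper-threads.

```shell
#!/bin/sh
# Compute the number of physical cores from lscpu output on stdin:
# "Core(s) per socket" multiplied by "Socket(s)".
physical_cores() {
    awk -F: '
        /^Core\(s\) per socket/ { cores = $2 + 0 }
        /^Socket\(s\)/          { sockets = $2 + 0 }
        END { print cores * sockets }
    '
}

# Usage:
#   lscpu | physical_cores
```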
chuanqi129 commented 6 years ago

@zhang-xin Thanks for the quick reply. When you created the container, did you add --shm-size=40G? Could you also post the output of docker info and docker inspect test? Here test is the container name.

xzhangxa commented 6 years ago

@chuanqi129 I created the container exactly as in the wiki: docker run -tid --net host --name test --privileged --shm-size=40G bvlc/caffe:intel_multinode.

I managed to make it work by passing --num_mlsl_servers 0 with everything else unchanged; multinode training then seems to work. The default is -1, which automatically chooses 4 (KNL) or 2 (BDW/SKX), and neither value works. Is this just a workaround or the intended behavior?

chuanqi129 commented 6 years ago

@zhang-xin I don't think that makes sense. Could you also send a part of the log from your workaround run? Can I see the output of docker info and docker inspect test? By the way, I can't reproduce this issue on skx-8180 or bdw-2699.

xzhangxa commented 6 years ago

@chuanqi129 Below are the Docker and container info, and the logs with and without num_mlsl_servers=0. With it, training is fine. Without it, the process hangs, and on each client there is one caffe process using exactly 100% CPU.

Actually, I found this solution in Intel MLSL issue https://github.com/intel/MLSL/issues/9. The problem looked similar, so I tried it.

BTW, I need to comment out the test_ssh_connection function in scripts/run_intelcaffe.sh; otherwise the script asks me for a password and the default 123456 doesn't work, even though public-key access already works.
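
A non-interactive way to confirm that key-based SSH works for every host is to run `ssh` with `BatchMode=yes`, which makes it fail instead of prompting for a password. The sketch below assumes the hostfile lists one hostname per line. `check_passwordless_ssh` is a hypothetical helper and is not part of run_intelcaffe.sh.

```shell
#!/bin/sh
# Try a key-only SSH login to every host listed in the file given as $1.
# Prints one status line per host; returns 1 if any host failed.
check_passwordless_ssh() {
    hostfile=$1
    failed=0
    while IFS= read -r host; do
        # Skip blank lines.
        [ -n "$host" ] || continue
        # -n keeps ssh from consuming the rest of the hostfile on stdin.
        # BatchMode=yes disables password prompts, so a missing key fails fast.
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            failed=1
        fi
    done < "$hostfile"
    return "$failed"
}

# Usage:
#   check_passwordless_ssh hosts
```

If every host reports ok here while the script still prompts for a password, the problem is more likely in how the script tests the connection than in the key setup.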

docker info output:

[z1r04h17@jfz1r04h17 ~]$ sudo docker info
Containers: 9
 Running: 2
 Paused: 0
 Stopped: 7
Images: 32
Server Version: 1.13.1
Storage Driver: overlay2
 Backing Filesystem: xfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: journald
Cgroup Driver: systemd
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: docker-runc runc
Default Runtime: docker-runc
Init Binary: docker-init
containerd version:  (expected: aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1)
runc version: N/A (expected: 9df8b306d01f59d3a8029be411de015b7304dd8f)
init version: N/A (expected: 949e6facb77383876aeff8a6944dde66b3089574)
Security Options:
 seccomp
  WARNING: You're not using the default seccomp profile
  Profile: /etc/docker/seccomp.json
 selinux
Kernel Version: 3.10.0-693.21.1.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 3
CPUs: 72
Total Memory: 125.4 GiB
Name: jfz1r04h17
ID: 67OS:IZJP:MT66:56VW:QID5:QY6D:5S3D:EOMO:SS45:WQWA:QFRO:YR4R
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Registries: docker.io (secure)

docker inspect test output:

[z1r04h17@jfz1r04h17 ~]$ sudo docker inspect test
[
    {
        "Id": "9461ce4b05bfe15314ba4addd161c13f3f9dbebc9164773c4b61a75ab306d404",
        "Created": "2018-04-27T10:56:54.827011974Z",
        "Path": "/usr/sbin/sshd",
        "Args": [
            "-D"
        ],
        "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 234456,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2018-04-27T10:56:55.024590949Z",
            "FinishedAt": "0001-01-01T00:00:00Z"
        },
        "Image": "sha256:55c45a63f8f640e694ec59ce4fd288ea2fc432432b737abddbecd4b7f17783a2",
        "ResolvConfPath": "/var/lib/docker/containers/9461ce4b05bfe15314ba4addd161c13f3f9dbebc9164773c4b61a75ab306d404/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/9461ce4b05bfe15314ba4addd161c13f3f9dbebc9164773c4b61a75ab306d404/hostname",
        "HostsPath": "/var/lib/docker/containers/9461ce4b05bfe15314ba4addd161c13f3f9dbebc9164773c4b61a75ab306d404/hosts",
        "LogPath": "",
        "Name": "/test",
        "RestartCount": 0,
        "Driver": "overlay2",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": null,
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "journald",
                "Config": {}
            },
            "NetworkMode": "host",
            "PortBindings": {},
            "RestartPolicy": {
                "Name": "no",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "CapAdd": null,
            "CapDrop": null,
            "Dns": [],
            "DnsOptions": [],
            "DnsSearch": [],
            "ExtraHosts": null,
            "GroupAdd": null,
            "IpcMode": "",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": true,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": [
                "label=disable"
            ],
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 42949672960,
            "Runtime": "docker-runc",
            "ConsoleSize": [
                0,
                0
            ],
            "Isolation": "",
            "CpuShares": 0,
            "Memory": 0,
            "NanoCpus": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": null,
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": [],
            "DiskQuota": 0,
            "KernelMemory": 0,
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": -1,
            "OomKillDisable": false,
            "PidsLimit": 0,
            "Ulimits": null,
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0
        },
        "GraphDriver": {
            "Name": "overlay2",
            "Data": {
                "LowerDir": "/var/lib/docker/overlay2/e58d7b390c291907f8d044884e4d3624bb8366941798d4d880bb0f7d783da35b-init/diff:/var/lib/docker/overlay2/374a56a948bde624f8bc0bf1de9c02452b457b62df8ba7779e5a8b9f4b48bdf0/diff:/var/lib/docker/overlay2/0e3b50fe1d4074f11802f471bfb87e7977ad164143d3d8ee4a43c999b0192e49/diff:/var/lib/docker/overlay2/cf77251f189611522a2d5ac353a62ab11d18d3b9dcd5cda28d618322dac29d6b/diff:/var/lib/docker/overlay2/e56c32c6aafc2f9d6e4e7901ba5b8922fc6e2a119e63df1199100739ea38147a/diff:/var/lib/docker/overlay2/d29aafdf93edccbe9b0910da13de665169f631f5d742bda781459670f2d80839/diff:/var/lib/docker/overlay2/2e464ff8386bccd4119a540dd46acac3af5a986286fec84328ba836254385ef0/diff:/var/lib/docker/overlay2/b1400afe7e3eed6e08211dbfab3f81fd3121c7a2602080110ded7839b1bdb4f6/diff:/var/lib/docker/overlay2/22ac113d7d3c52e7f146fc005358716a426f5049728fda48c87e7af595607e09/diff:/var/lib/docker/overlay2/aad9061b290341688002d72aa678689dcc34f9ed9e42abbc22dfa9a0c9d436c1/diff:/var/lib/docker/overlay2/533f872472f56be375a7bc91bfdc74ba80cb3324125c65e95d2099c548e3b5af/diff:/var/lib/docker/overlay2/b135baa56c5d84a2e39dc2015afae3a542c0f9ad3f58ecefc085817a2d940d29/diff:/var/lib/docker/overlay2/15d9d74d18f920077341c56011beb6e884187c6f191d8f93114c6ca0a91a583e/diff:/var/lib/docker/overlay2/1f0031f1652a52cb3fe5e0dc4256a2ae4ba6e61d5e18780b8e2f8502063e0bf5/diff:/var/lib/docker/overlay2/82718a5e218f6bc1f52a647846acfadfdb557de56786298a2058fefa3e750453/diff:/var/lib/docker/overlay2/5fd590516dc3eb64796753a0d0a83c89b22ffc789d97581229bf7a021ef38ed2/diff:/var/lib/docker/overlay2/9bcbd1a28944867e1b96188d20f5aab382d4e4e723c0c98fbd3ce2c249ffd498/diff",
                "MergedDir": "/var/lib/docker/overlay2/e58d7b390c291907f8d044884e4d3624bb8366941798d4d880bb0f7d783da35b/merged",
                "UpperDir": "/var/lib/docker/overlay2/e58d7b390c291907f8d044884e4d3624bb8366941798d4d880bb0f7d783da35b/diff",
                "WorkDir": "/var/lib/docker/overlay2/e58d7b390c291907f8d044884e4d3624bb8366941798d4d880bb0f7d783da35b/work"
            }
        },
        "Mounts": [],
        "Config": {
            "Hostname": "jfz1r04h17",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "ExposedPorts": {
                "10010/tcp": {}
            },
            "Tty": true,
            "OpenStdin": true,
            "StdinOnce": false,
            "Env": [
                "PATH=/opt/caffe/build/tools:/opt/caffe/python:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                "CAFFE_ROOT=/opt/caffe",
                "CLONE_TAG=master",
                "PYCAFFE_ROOT=/opt/caffe/python",
                "PYTHONPATH=/opt/caffe/python:",
                "NOTVISIBLE=in users profile"
            ],
            "Cmd": [
                "/usr/sbin/sshd",
                "-D"
            ],
            "ArgsEscaped": true,
            "Image": "bvlc/caffe:intel_multinode",
            "Volumes": null,
            "WorkingDir": "/opt/caffe",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": {}
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "49b1e71478d81172f6661e7980be22da6d5f62618536cbd896d990cbcfd96329",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {},
            "SandboxKey": "/var/run/docker/netns/default",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "",
            "Gateway": "",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "",
            "IPPrefixLen": 0,
            "IPv6Gateway": "",
            "MacAddress": "",
            "Networks": {
                "host": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": null,
                    "NetworkID": "da2b061e29f0032925da63e696ded0a04895184839683d6afe4175ca37b188c6",
                    "EndpointID": "368d2decbbb98422297af63f3f01acb4c02abbbc06131174fa0119f8c47f9800",
                    "Gateway": "",
                    "IPAddress": "",
                    "IPPrefixLen": 0,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": ""
                }
            }
        }
    }
]
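
The `ShmSize` field in the inspect output above is reported in bytes, so 42949672960 corresponds to the 40G passed to `docker run`. As a quick check, something like the following could be used. `docker inspect --format` takes a Go template; the helper names `bytes_to_gib` and `container_shm_gib` are hypothetical.

```shell
#!/bin/sh
# Convert a byte count to whole GiB (integer division).
bytes_to_gib() {
    echo $(( $1 / 1024 / 1024 / 1024 ))
}

# Print the configured /dev/shm size of a container in GiB.
# $1 is the container name or ID.
container_shm_gib() {
    bytes_to_gib "$(docker inspect --format '{{.HostConfig.ShmSize}}' "$1")"
}

# Usage:
#   container_shm_gib test
```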

Log when setting --num_mlsl_servers 0

root@jfz1r04h17:/opt/caffe# ./scripts/run_intelcaffe.sh --hostfile hosts --solver examples/mnist/lenet_solver_mlsl.prototxt --network tcp --netmask enp134s0f0 --num_mlsl_servers 0

CPUs with optimal settings:
    Intel Xeon E7-88/48xx, E5-46/26/16xx, E3-12xx, D15/D-15 (Broadwell)
    Intel Xeon Phi 7210/30/50/90 (Knights Landing)
    Intel Xeon Platinum 81/61/51/41/31xx (Skylake)

Settings:
    CPU: skx
    Host file: hosts
    Running mode: train
    Benchmark: none
    Debug option: off
    Engine:
    Number of MLSL servers: 0
        -1: selected automatically according to CPU model.
            BDW/SKX: 2, KNL: 4
    Solver file: examples/mnist/lenet_solver_mlsl.prototxt
    LMDB data source: examples/mnist/mnist_train_lmdb
    LMDB data source: examples/mnist/mnist_test_lmdb
    Network: tcp
    Netmask for TCP network: enp134s0f0
    NUMA configuration: Flat mode.
Create result directory: /opt/caffe/result-20180427110708
    Number of nodes: 2
MLSL_NUM_SERVERS: 0
Pin internal threads to: 70,71
Number of OpenMP threads: 36
Run caffe with 2 nodes...
Warning: cannot find sensors
[0] [0] MPI startup(): Intel(R) MPI Library, Version 2018 Update 1  Build 20171011 (id: 17941)
[0] [0] MPI startup(): Copyright (C) 2003-2017 Intel Corporation.  All rights reserved.
[0] [0] MPI startup(): Multi-threaded optimized library
[0] [0] ckpt_restart(): The real interface being used for tcp is enp134s0f0 and interface hostname is jfz1r04h18
[0] [0] MPI startup(): tcp data transfer mode
[1] [1] ckpt_restart(): The real interface being used for tcp is enp134s0f0 and interface hostname is jfz1r04h19
[1] [1] MPI startup(): tcp data transfer mode
[0] [0] MPI startup(): Device_reset_idx=5
[0] [0] MPI startup(): Allgather: 4: 27306-38912 & 0-2
[0] [0] MPI startup(): Allgather: 4: 78064-294912 & 0-2
[0] [0] MPI startup(): Allgather: 3: 0-27306 & 0-2
[0] [0] MPI startup(): Allgather: 3: 38912-78064 & 0-2
[0] [0] MPI startup(): Allgather: 3: 0-2147483647 & 0-2
[0] [0] MPI startup(): Allgather: 1: 0-7 & 3-4
[0] [0] MPI startup(): Allgather: 1: 9-4607 & 3-4
[0] [0] MPI startup(): Allgather: 1: 66622-461338 & 3-4
[0] [0] MPI startup(): Allgather: 3: 9081-26350 & 3-4
[0] [0] MPI startup(): Allgather: 3: 461338-2692119 & 3-4
[0] [0] MPI startup(): Allgather: 4: 7-9 & 3-4
[0] [0] MPI startup(): Allgather: 4: 4607-9081 & 3-4
[0] [0] MPI startup(): Allgather: 4: 26350-66622 & 3-4
[0] [0] MPI startup(): Allgather: 4: 0-2147483647 & 3-4
[0] [0] MPI startup(): Allgather: 2: 1-1 & 5-2147483647
[0] [0] MPI startup(): Allgather: 4: 2-3 & 5-2147483647
[0] [0] MPI startup(): Allgather: 1: 4-5 & 5-2147483647
[0] [0] MPI startup(): Allgather: 4: 6-26 & 5-2147483647
[0] [0] MPI startup(): Allgather: 1: 27-98 & 5-2147483647
[0] [0] MPI startup(): Allgather: 3: 99-1029 & 5-2147483647
[0] [0] MPI startup(): Allgather: 4: 1030-5572 & 5-2147483647
[0] [0] MPI startup(): Allgather: 1: 5573-15186 & 5-2147483647
[0] [0] MPI startup(): Allgather: 2: 15187-33976 & 5-2147483647
[0] [0] MPI startup(): Allgather: 1: 33977-74391 & 5-2147483647
[0] [0] MPI startup(): Allgather: 3: 74392-131842 & 5-2147483647
[0] [0] MPI startup(): Allgather: 4: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 3: 0-2147483647 & 0-2
[0] [0] MPI startup(): Allgatherv: 1: 0-2 & 3-4
[0] [0] MPI startup(): Allgatherv: 2: 2-7 & 3-4
[0] [0] MPI startup(): Allgatherv: 1: 7-49 & 3-4
[0] [0] MPI startup(): Allgatherv: 2: 49-113 & 3-4
[0] [0] MPI startup(): Allgatherv: 4: 113-149 & 3-4
[0] [0] MPI startup(): Allgatherv: 3: 149-915 & 3-4
[0] [0] MPI startup(): Allgatherv: 1: 915-1614 & 3-4
[0] [0] MPI startup(): Allgatherv: 4: 1614-3296 & 3-4
[0] [0] MPI startup(): Allgatherv: 2: 3296-5670 & 3-4
[0] [0] MPI startup(): Allgatherv: 1: 5670-10998 & 3-4
[0] [0] MPI startup(): Allgatherv: 4: 10998-185966 & 3-4
[0] [0] MPI startup(): Allgatherv: 3: 185966-381166 & 3-4
[0] [0] MPI startup(): Allgatherv: 4: 381166-1597083 & 3-4
[0] [0] MPI startup(): Allgatherv: 3: 1597083-2998114 & 3-4
[0] [0] MPI startup(): Allgatherv: 4: 0-2147483647 & 3-4
[0] [0] MPI startup(): Allgatherv: 2: 0-47 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 1: 47-103 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 3: 103-438 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 2: 438-757 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 4: 757-1453 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 2: 1453-3133 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 4: 3133-6762 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 2: 6762-10802 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 4: 10802-49917 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 3: 49917-309996 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 4: 309996-3739157 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 3: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Allreduce: 1: 804-1535 & 0-2
[0] [0] MPI startup(): Allreduce: 1: 2061-17116 & 0-2
[0] [0] MPI startup(): Allreduce: 2: 17116-37171 & 0-2
[0] [0] MPI startup(): Allreduce: 2: 344562-1048576 & 0-2
[0] [0] MPI startup(): Allreduce: 3: 37171-344562 & 0-2
[0] [0] MPI startup(): Allreduce: 7: 0-804 & 0-2
[0] [0] MPI startup(): Allreduce: 7: 1535-2061 & 0-2
[0] [0] MPI startup(): Allreduce: 7: 1048576-3026207 & 0-2
[0] [0] MPI startup(): Allreduce: 4: 3026207-8388608 & 0-2
[0] [0] MPI startup(): Allreduce: 7: 8388609-8635416 & 0-2
[0] [0] MPI startup(): Allreduce: 2: 0-2147483647 & 0-2
[0] [0] MPI startup(): Allreduce: 7: 0-6 & 3-4
[0] [0] MPI startup(): Allreduce: 4: 6-11 & 3-4
[0] [0] MPI startup(): Allreduce: 7: 11-49 & 3-4
[0] [0] MPI startup(): Allreduce: 6: 49-321 & 3-4
[0] [0] MPI startup(): Allreduce: 2: 321-720 & 3-4
[0] [0] MPI startup(): Allreduce: 4: 720-1375 & 3-4
[0] [0] MPI startup(): Allreduce: 1: 1375-173904 & 3-4
[0] [0] MPI startup(): Allreduce: 2: 173904-318383 & 3-4
[0] [0] MPI startup(): Allreduce: 7: 318383-1512039 & 3-4
[0] [0] MPI startup(): Allreduce: 6: 1512039-2561761 & 3-4
[0] [0] MPI startup(): Allreduce: 4: 2561762-8388608 & 3-4
[0] [0] MPI startup(): Allreduce: 7: 8388609-10618873 & 3-4
[0] [0] MPI startup(): Allreduce: 8: 0-2147483647 & 3-4
[0] [0] MPI startup(): Allreduce: 1: 0-11 & 5-8
[0] [0] MPI startup(): Allreduce: 4: 11-24 & 5-8
[0] [0] MPI startup(): Allreduce: 6: 24-42 & 5-8
[0] [0] MPI startup(): Allreduce: 1: 42-107 & 5-8
[0] [0] MPI startup(): Allreduce: 4: 107-178 & 5-8
[0] [0] MPI startup(): Allreduce: 1: 178-310 & 5-8
[0] [0] MPI startup(): Allreduce: 2: 310-594 & 5-8
[0] [0] MPI startup(): Allreduce: 5: 594-4431 & 5-8
[0] [0] MPI startup(): Allreduce: 1: 4431-54874 & 5-8
[0] [0] MPI startup(): Allreduce: 4: 54874-91696 & 5-8
[0] [0] MPI startup(): Allreduce: 6: 91696-175538 & 5-8
[0] [0] MPI startup(): Allreduce: 4: 175538-383770 & 5-8
[0] [0] MPI startup(): Allreduce: 2: 383770-684262 & 5-8
[0] [0] MPI startup(): Allreduce: 3: 0-2147483647 & 5-8
[0] [0] MPI startup(): Allreduce: 1: 0-11 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 4: 11-24 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 6: 24-42 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 1: 42-107 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 4: 107-178 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 1: 178-310 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 2: 310-594 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 5: 594-4431 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 1: 4431-54874 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 4: 54874-91696 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 6: 91696-175538 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 4: 175538-383770 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 2: 383770-32006608 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 3: 0-2147483647 & 9-2147483647
[0] [0] MPI startup(): Alltoall: 3: 0-129493 & 0-2
[0] [0] MPI startup(): Alltoall: 3: 1080889-3453431 & 0-2
[0] [0] MPI startup(): Alltoall: 2: 129493-1080889 & 0-2
[0] [0] MPI startup(): Alltoall: 2: 0-2147483647 & 0-2
[0] [0] MPI startup(): Alltoall: 2: 0-2147483647 & 3-4
[0] [0] MPI startup(): Alltoall: 1: 1-64 & 5-2147483647
[0] [0] MPI startup(): Alltoall: 2: 65-572235 & 5-2147483647
[0] [0] MPI startup(): Alltoall: 4: 572236-1736997 & 5-2147483647
[0] [0] MPI startup(): Alltoall: 3: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Alltoallv: 1: 0-2147483647 & 0-2
[0] [0] MPI startup(): Alltoallv: 2: 0-2147483647 & 3-4
[0] [0] MPI startup(): Alltoallv: 2: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Alltoallw: 0: 0-2147483647 & 0-2147483647
[0] [0] MPI startup(): Barrier: 1: 0-2147483647 & 0-2
[0] [0] MPI startup(): Barrier: 6: 0-2147483647 & 3-4
[0] [0] MPI startup(): Barrier: 1: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Bcast: 7: 0-8 & 0-2
[0] [0] MPI startup(): Bcast: 7: 24-64 & 0-2
[0] [0] MPI startup(): Bcast: 7: 11264-52186 & 0-2
[0] [0] MPI startup(): Bcast: 7: 112045-131072 & 0-2
[1] [1] MPI startup(): Recognition=2 Platform(code=512 ippn=0 dev=4) Fabric(intra=6 inter=6 flags=0x0)
[0] [0] MPI startup(): Bcast: 7: 1048576-2097152 & 0-2
[0] [0] MPI startup(): Bcast: 1: 8-24 & 0-2
[0] [0] MPI startup(): Bcast: 1: 64-11264 & 0-2
[0] [0] MPI startup(): Bcast: 1: 52186-112045 & 0-2
[0] [0] MPI startup(): Bcast: 1: 131072-1048576 & 0-2
[0] [0] MPI startup(): Bcast: 1: 0-2147483647 & 0-2
[0] [0] MPI startup(): Bcast: 1: 1-1 & 3-4
[0] [0] MPI startup(): Bcast: 5: 2-3 & 3-4
[0] [0] MPI startup(): Bcast: 1: 4-5 & 3-4
[0] [0] MPI startup(): Bcast: 6: 6-11 & 3-4
[0] [0] MPI startup(): Bcast: 5: 12-24 & 3-4
[0] [0] MPI startup(): Bcast: 4: 25-141 & 3-4
[0] [0] MPI startup(): Bcast: 7: 142-370 & 3-4
[0] [0] MPI startup(): Bcast: 3: 371-680 & 3-4
[0] [0] MPI startup(): Bcast: 4: 681-3894 & 3-4
[0] [0] MPI startup(): Bcast: 1: 3895-4494 & 3-4
[0] [0] MPI startup(): Bcast: 7: 4495-14778 & 3-4
[0] [0] MPI startup(): Bcast: 4: 14779-18223 & 3-4
[0] [0] MPI startup(): Bcast: 7: 18224-36738 & 3-4
[0] [0] MPI startup(): Bcast: 3: 0-2147483647 & 3-4
[0] [0] MPI startup(): Bcast: 1: 0-10 & 5-2147483647
[0] [0] MPI startup(): Bcast: 1: 175-16799 & 5-2147483647
[0] [0] MPI startup(): Bcast: 6: 10-32 & 5-2147483647
[0] [0] MPI startup(): Bcast: 6: 32-175 & 5-2147483647
[0] [0] MPI startup(): Bcast: 7: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Exscan: 0: 0-2147483647 & 0-2147483647
[0] [0] MPI startup(): Gather: 2: 73643-172031 & 0-2
[0] [0] MPI startup(): Gather: 3: 0-853 & 0-2
[0] [0] MPI startup(): Gather: 3: 54613-73643 & 0-2
[0] [0] MPI startup(): Gather: 3: 262144-524288 & 0-2
[0] [0] MPI startup(): Gather: 1: 853-54613 & 0-2
[0] [0] MPI startup(): Gather: 1: 172031-262144 & 0-2
[0] [0] MPI startup(): Gather: 1: 0-2147483647 & 0-2
[0] [0] MPI startup(): Gather: 2: 34148-129691 & 3-2147483647
[0] [0] MPI startup(): Gather: 2: 503316-2506634 & 3-2147483647
[0] [0] MPI startup(): Gather: 3: 0-34148 & 3-2147483647
[0] [0] MPI startup(): Gather: 3: 129691-503316 & 3-2147483647
[0] [0] MPI startup(): Gather: 3: 0-2147483647 & 3-2147483647
[0] [0] MPI startup(): Gatherv: 1: 0-2147483647 & 0-2
[0] [0] MPI startup(): Gatherv: 1: 0-2147483647 & 3-4
[0] [0] MPI startup(): Gatherv: 1: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 4: 0-5 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 1: 5-26 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 3: 26-47 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 5: 47-98 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 3: 98-188 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 5: 188-362 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 2: 362-588 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 1: 588-1951 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 3: 1951-11702 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 1: 11702-23138 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 5: 23138-58229 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 1: 58229-191964 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 2: 191964-2656092 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 5: 0-2147483647 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 4: 0-4 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 5: 4-12 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 3: 12-45 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 1: 45-85 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 3: 85-391 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 1: 391-596 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 2: 596-1927 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 5: 1927-2286 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 3: 2286-7442 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 1: 7442-10726 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 3: 10726-45950 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 5: 45950-101084 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 1: 101084-159597 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 3: 159597-423110 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 2: 423110-578734 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 5: 578734-1329975 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 1: 1329975-4146461 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 3: 0-2147483647 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 5: 0-5 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 1: 5-28 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 5: 28-50 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 3: 50-197 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 1: 197-721 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 2: 721-3207 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 1: 3207-5980 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 5: 5980-11416 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 3: 11416-104215 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 5: 104215-277330 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 3: 277330-630522 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 1: 630522-2659184 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 5: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Reduce: 4: 4-8 & 0-2
[0] [0] MPI startup(): Reduce: 3: 9-29 & 0-2
[0] [0] MPI startup(): Reduce: 2: 30-37 & 0-2
[0] [0] MPI startup(): Reduce: 3: 38-215 & 0-2
[0] [0] MPI startup(): Reduce: 2: 216-315 & 0-2
[0] [0] MPI startup(): Reduce: 5: 316-775 & 0-2
[0] [0] MPI startup(): Reduce: 2: 776-4045 & 0-2
[0] [0] MPI startup(): Reduce: 4: 4-6 & 3-4
[0] [0] MPI startup(): Reduce: 3: 7-11 & 3-4
[0] [0] MPI startup(): Reduce: 6: 12-16 & 3-4
[0] [0] MPI startup(): Reduce: 4: 17-34 & 3-4
[0] [0] MPI startup(): Reduce: 2: 35-99 & 3-4
[0] [0] MPI startup(): Reduce: 4: 100-230 & 3-4
[0] [0] MPI startup(): Reduce: 6: 231-275 & 3-4
[0] [0] MPI startup(): Reduce: 1: 276-1040 & 3-4
[0] [0] MPI startup(): Reduce: 3: 1041-3895 & 3-4
[0] [0] MPI startup(): Reduce: 6: 3896-4326 & 3-4
[0] [0] MPI startup(): Reduce: 3: 4327-10163 & 3-4
[0] [0] MPI startup(): Reduce: 1: 0-2147483647 & 3-4
[0] [0] MPI startup(): Reduce: 2: 4-26 & 5-2147483647
[0] [0] MPI startup(): Reduce: 4: 27-39 & 5-2147483647
[0] [0] MPI startup(): Reduce: 2: 40-230 & 5-2147483647
[0] [0] MPI startup(): Reduce: 3: 231-257 & 5-2147483647
[0] [0] MPI startup(): Reduce: 2: 258-718 & 5-2147483647
[0] [0] MPI startup(): Reduce: 3: 719-2436 & 5-2147483647
[0] [0] MPI startup(): Reduce: 4: 2437-6344 & 5-2147483647
[0] [0] MPI startup(): Reduce: 1: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Scan: 0: 0-2147483647 & 0-2147483647
[0] [0] MPI startup(): Scatter: 1: 0-1 & 0-2
[0] [0] MPI startup(): Scatter: 1: 4-12 & 0-2
[0] [0] MPI startup(): Scatter: 1: 19-2048 & 0-2
[0] [0] MPI startup(): Scatter: 3: 2048-85701 & 0-2
[0] [0] MPI startup(): Scatter: 3: 165767-466939 & 0-2
[0] [0] MPI startup(): Scatter: 3: 524288-2336552 & 0-2
[0] [0] MPI startup(): Scatter: 2: 1-4 & 0-2
[0] [0] MPI startup(): Scatter: 2: 12-19 & 0-2
[0] [0] MPI startup(): Scatter: 2: 85701-165767 & 0-2
[0] [0] MPI startup(): Scatter: 2: 466939-524288 & 0-2
[0] [0] MPI startup(): Scatter: 2: 0-2147483647 & 0-2
[0] [0] MPI startup(): Scatter: 3: 0-1909200 & 3-2147483647
[0] [0] MPI startup(): Scatter: 2: 0-2147483647 & 3-2147483647
[0] [0] MPI startup(): Scatterv: 1: 0-2147483647 & 0-2
[0] [0] MPI startup(): Scatterv: 1: 0-2147483647 & 3-4
[0] [0] MPI startup(): Scatterv: 1: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Rank    Pid      Node name   Pin cpu
[0] [0] MPI startup(): 0       125      jfz1r04h18  {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,
[0]                                   30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56
[0]                                   ,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71}
[0] [0] MPI startup(): 1       97       jfz1r04h19  {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,
[0]                                   30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56
[0]                                   ,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71}
[0] [0] MPI startup(): Recognition=2 Platform(code=512 ippn=0 dev=4) Fabric(intra=6 inter=6 flags=0x0)
[0] [0] MPI startup(): I_MPI_COLL_INTRANODE=pt2pt
[0] [0] MPI startup(): I_MPI_DEBUG=6
[0] [0] MPI startup(): I_MPI_FABRICS=tcp
[0] [0] MPI startup(): I_MPI_FALLBACK=0
[0] [0] MPI startup(): I_MPI_INFO_NUMA_NODE_MAP=hfi1_0:0,i40iw0:0,i40iw1:0
[0] [0] MPI startup(): I_MPI_INFO_NUMA_NODE_NUM=2
[0] [0] MPI startup(): I_MPI_PIN_MAPPING=1:0 0
[0] [0] MPI startup(): I_MPI_TCP_NETMASK=enp134s0f0
[0] I0427 11:07:09.561863   125 caffe.cpp:742] Number of groups: 1, group size: 2, number of parameter servers: 0
[1] I0427 11:07:09.567484    97 caffe.cpp:285] Use CPU.
[1] I0427 11:07:09.567849    97 solver.cpp:107] Initializing solver from parameters:
[1] test_iter: 100
[1] test_interval: 10000
[1] base_lr: 0.01
[1] display: 100
[1] max_iter: 50
[1] lr_policy: "inv"
[1] gamma: 0.0001
[1] power: 0.75
[1] momentum: 0.9
[1] weight_decay: 0.0005
[1] snapshot: 10000
[1] snapshot_prefix: "examples/mnist/lenet_mlsl"
[1] solver_mode: CPU
[1] net: "examples/mnist/lenet_train_test_mlsl.prototxt"
[1] train_state {
[1]   level: 0
[1]   stage: ""
[1] }
[1] I0427 11:07:09.567863    97 solver.cpp:153] Creating training net from net file: examples/mnist/lenet_train_test_mlsl.prototxt
[1] I0427 11:07:09.569360    97 cpu_info.cpp:453] Processor speed [MHz]: 2300
[1] I0427 11:07:09.569368    97 cpu_info.cpp:456] Total number of sockets: 2
[1] I0427 11:07:09.569372    97 cpu_info.cpp:459] Total number of CPU cores: 36
[1] I0427 11:07:09.569375    97 cpu_info.cpp:462] Total number of processors: 72
[1] I0427 11:07:09.569376    97 cpu_info.cpp:465] GPU is used: no
[1] I0427 11:07:09.569380    97 cpu_info.cpp:468] OpenMP environmental variables are specified: yes
[1] I0427 11:07:09.569381    97 cpu_info.cpp:471] OpenMP thread bind allowed: no
[0] I0427 11:07:09.565732   125 caffe.cpp:285] Use CPU.
[0] I0427 11:07:09.565878   125 solver.cpp:107] Initializing solver from parameters:
[0] test_iter: 100
[0] test_interval: 10000
[0] base_lr: 0.01
[0] display: 100
[0] max_iter: 50
[0] lr_policy: "inv"
[0] gamma: 0.0001
[0] power: 0.75
[0] momentum: 0.9
[0] weight_decay: 0.0005
[0] snapshot: 10000
[0] snapshot_prefix: "examples/mnist/lenet_mlsl"
[0] solver_mode: CPU
[0] net: "examples/mnist/lenet_train_test_mlsl.prototxt"
[0] train_state {
[0]   level: 0
[0]   stage: ""
[0] }
[0] I0427 11:07:09.565899   125 solver.cpp:153] Creating training net from net file: examples/mnist/lenet_train_test_mlsl.prototxt
[0] I0427 11:07:09.569108   125 cpu_info.cpp:453] Processor speed [MHz]: 2300
[0] I0427 11:07:09.569123   125 cpu_info.cpp:456] Total number of sockets: 2
[0] I0427 11:07:09.569128   125 cpu_info.cpp:459] Total number of CPU cores: 36
[0] I0427 11:07:09.569133   125 cpu_info.cpp:462] Total number of processors: 72
[0] I0427 11:07:09.569135   125 cpu_info.cpp:465] GPU is used: no
[0] I0427 11:07:09.569140   125 cpu_info.cpp:468] OpenMP environmental variables are specified: yes
[0] I0427 11:07:09.569144   125 cpu_info.cpp:471] OpenMP thread bind allowed: no
[1] I0427 11:07:09.576323    97 cpu_info.cpp:474] Number of OpenMP threads: 36
[1] I0427 11:07:09.576455    97 net.cpp:1052] The NetState phase (0) differed from the phase (1) specified by a rule in layer mnist
[1] I0427 11:07:09.576479    97 net.cpp:1052] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
[1] I0427 11:07:09.576784    97 net.cpp:207] Initializing net from parameters:
[1] I0427 11:07:09.576802    97 net.cpp:208]
[1] name: "LeNet"
[1] state {
[1]   phase: TRAIN
[1]   level: 0
[1]   stage: ""
[1] }
[1] engine: "MKLDNN"
[1] compile_net_state {
[1]   bn_scale_remove: false
[1]   bn_scale_merge: false
[1] }
[1] layer {
[1]   name: "mnist"
[1]   type: "Data"
[1]   top: "data"
[1]   top: "label"
[1]   include {
[1]     phase: TRAIN
[1]   }
[1]   transform_param {
[1]     scale: 0.00390625
[1]   }
[1]   data_param {
[1]     source: "examples/mnist/mnist_train_lmdb"
[1]     batch_size: 64
[1]     backend: LMDB
[1]   }
[1] }
[1] layer {
[1]   name: "conv1"
[1]   type: "Convolution"
[1]   bottom: "data"
[1]   top: "conv1"
[1]   param {
[1]     lr_mult: 1
[1]   }
[1]   convolution_param {
[1]     num_output: 20
[1]     bias_term: false
[1]     kernel_size: 5
[1]     stride: 1
[1]     weight_filler {
[1]       type: "xavier"
[1]     }
[1]     engine: MKL2017
[1]   }
[1] }
[1] layer {
[1]   name: "pool1"
[1]   type: "Pooling"
[1]   bottom: "conv1"
[1]   top: "pool1"
[1]   pooling_param {
[1]     pool: MAX
[1]     kernel_size: 2
[1]     stride: 2
[1]     engine: MKL2017
[1]   }
[1] }
[1] layer {
[1]   name: "conv2"
[1]   type: "Convolution"
[1]   bottom: "pool1"
[1]   top: "conv2"
[1]   param {
[1]     lr_mult: 1
[1]   }
[1]   convolution_param {
[1]     num_output: 50
[1]     bias_term: false
[1]     kernel_size: 5
[1]     stride: 1
[1]     weight_filler {
[1]       type: "xavier"
[1]     }
[1]     engine: MKL2017
[1]   }
[1] }
[1] layer {
[1]   name: "pool2"
[1]   type: "Pooling"
[1]   bottom: "conv2"
[1]   top: "pool2"
[1]   pooling_param {
[1]     pool: MAX
[1]     kernel_size: 2
[1]     stride: 2
[1]     engine: MKL2017
[1]   }
[1] }
[1] layer {
[1]   name: "ip1"
[1]   type: "InnerProduct"
[1]   bottom: "pool2"
[1]   top: "ip1"
[1]   param {
[1]     lr_mult: 1
[1]   }
[1]   inner_product_param {
[1]     num_output: 500
[1]     bias_term: false
[1]     weight_filler {
[1]       type: "xavier"
[1]     }
[1]   }
[1] }
[1] layer {
[1]   name: "relu1"
[1]   type: "ReLU"
[1]   bottom: "ip1"
[1]   top: "ip1"
[1]   relu_param {
[1]     engine: MKL2017
[1]   }
[1] }
[1] layer {
[1]   name: "ip2"
[1]   type: "InnerProduct"
[1]   bottom: "ip1"
[1]   top: "ip2"
[1]   param {
[1]     lr_mult: 1
[1]   }
[1]   inner_product_param {
[1]     num_output: 10
[1]     bias_term: false
[1]     weight_filler {
[1]       type: "xavier"
[1]     }
[1]   }
[1] }
[1] layer {
[1]   name: "loss"
[1]   type: "SoftmaxWithLoss"
[1]   bottom: "ip2"
[1]   bottom: "label"
[1]   top: "loss"
[1] }
[1] I0427 11:07:09.576978    97 layer_factory.hpp:114] Creating layer mnist
[1] I0427 11:07:09.577211    97 net.cpp:265] Creating Layer mnist
[1] I0427 11:07:09.577231    97 net.cpp:1238] mnist -> data
[1] I0427 11:07:09.577255    97 net.cpp:1238] mnist -> label
[1] W0427 11:07:09.577289    97 net.cpp:335] SetMinibatchSize 64
[1] I0427 11:07:09.577648    99 internal_thread.cpp:135] Internal thread is affinitized to core 70
[1] I0427 11:07:09.577898    99 db_lmdb.cpp:72] Opened lmdb examples/mnist/mnist_train_lmdb
[1] I0427 11:07:09.578017    97 data_layer.cpp:80] output data size: 64,1,28,28
[0] I0427 11:07:09.576759   125 cpu_info.cpp:474] Number of OpenMP threads: 36
[0] I0427 11:07:09.576941   125 net.cpp:1052] The NetState phase (0) differed from the phase (1) specified by a rule in layer mnist
[0] I0427 11:07:09.576974   125 net.cpp:1052] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
[0] I0427 11:07:09.577446   125 net.cpp:207] Initializing net from parameters:
[0] I0427 11:07:09.577476   125 net.cpp:208]
[0] name: "LeNet"
[0] state {
[0]   phase: TRAIN
[0]   level: 0
[0]   stage: ""
[0] }
[0] engine: "MKLDNN"
[0] compile_net_state {
[0]   bn_scale_remove: false
[0]   bn_scale_merge: false
[0] }
[0] layer {
[0]   name: "mnist"
[0]   type: "Data"
[0]   top: "data"
[0]   top: "label"
[0]   include {
[0]     phase: TRAIN
[0]   }
[0]   transform_param {
[0]     scale: 0.00390625
[0]   }
[0]   data_param {
[0]     source: "examples/mnist/mnist_train_lmdb"
[0]     batch_size: 64
[0]     backend: LMDB
[0]   }
[0] }
[0] layer {
[0]   name: "conv1"
[0]   type: "Convolution"
[0]   bottom: "data"
[0]   top: "conv1"
[0]   param {
[0]     lr_mult: 1
[0]   }
[0]   convolution_param {
[0]     num_output: 20
[0]     bias_term: false
[0]     kernel_size: 5
[0]     stride: 1
[0]     weight_filler {
[0]       type: "xavier"
[0]     }
[0]     engine: MKL2017
[0]   }
[0] }
[0] layer {
[0]   name: "pool1"
[0]   type: "Pooling"
[0]   bottom: "conv1"
[0]   top: "pool1"
[0]   pooling_param {
[0]     pool: MAX
[0]     kernel_size: 2
[0]     stride: 2
[0]     engine: MKL2017
[0]   }
[0] }
[0] layer {
[0]   name: "conv2"
[0]   type: "Convolution"
[0]   bottom: "pool1"
[0]   top: "conv2"
[0]   param {
[0]     lr_mult: 1
[0]   }
[0]   convolution_param {
[0]     num_output: 50
[0]     bias_term: false
[0]     kernel_size: 5
[0]     stride: 1
[0]     weight_filler {
[0]       type: "xavier"
[0]     }
[0]     engine: MKL2017
[0]   }
[0] }
[0] layer {
[0]   name: "pool2"
[0]   type: "Pooling"
[0]   bottom: "conv2"
[0]   top: "pool2"
[0]   pooling_param {
[0]     pool: MAX
[0]     kernel_size: 2
[0]     stride: 2
[0]     engine: MKL2017
[0]   }
[0] }
[0] layer {
[0]   name: "ip1"
[0]   type: "InnerProduct"
[0]   bottom: "pool2"
[0]   top: "ip1"
[0]   param {
[0]     lr_mult: 1
[0]   }
[0]   inner_product_param {
[0]     num_output: 500
[0]     bias_term: false
[0]     weight_filler {
[0]       type: "xavier"
[0]     }
[0]   }
[0] }
[0] layer {
[0]   name: "relu1"
[0]   type: "ReLU"
[0]   bottom: "ip1"
[0]   top: "ip1"
[0]   relu_param {
[0]     engine: MKL2017
[0]   }
[0] }
[0] layer {
[0]   name: "ip2"
[0]   type: "InnerProduct"
[0]   bottom: "ip1"
[0]   top: "ip2"
[0]   param {
[0]     lr_mult: 1
[0]   }
[0]   inner_product_param {
[0]     num_output: 10
[0]     bias_term: false
[0]     weight_filler {
[0]       type: "xavier"
[0]     }
[0]   }
[0] }
[0] layer {
[0]   name: "loss"
[0]   type: "SoftmaxWithLoss"
[0]   bottom: "ip2"
[0]   bottom: "label"
[0]   top: "loss"
[0] }
[0] I0427 11:07:09.577767   125 layer_factory.hpp:114] Creating layer mnist
[0] I0427 11:07:09.578042   125 net.cpp:265] Creating Layer mnist
[0] I0427 11:07:09.578061   125 net.cpp:1238] mnist -> data
[0] I0427 11:07:09.578085   125 net.cpp:1238] mnist -> label
[0] W0427 11:07:09.578212   125 net.cpp:335] SetMinibatchSize 64
[0] I0427 11:07:09.578627   127 internal_thread.cpp:135] Internal thread is affinitized to core 70
[0] I0427 11:07:09.578936   127 db_lmdb.cpp:72] Opened lmdb examples/mnist/mnist_train_lmdb
[0] I0427 11:07:09.579085   125 data_layer.cpp:80] output data size: 64,1,28,28
[0] I0427 11:07:09.580693   125 net.cpp:360] Setting up mnist
[0] I0427 11:07:09.580719   125 net.cpp:367] Top shape: 64 1 28 28 (50176)
[0] I0427 11:07:09.580727   125 net.cpp:367] Top shape: 64 (64)
[0] I0427 11:07:09.580734   125 net.cpp:375] Memory required for data: 200960
[0] I0427 11:07:09.580742   125 layer_factory.hpp:114] Creating layer conv1
[1] I0427 11:07:09.584623    97 net.cpp:360] Setting up mnist
[0] I0427 11:07:09.580778   125 net.cpp:265] Creating Layer conv1
[0] I0427 11:07:09.580785   125 net.cpp:1264] conv1 <- data
[1] I0427 11:07:09.584648    97 net.cpp:367] Top shape: 64 1 28 28 (50176)
[1] I0427 11:07:09.584658    97 net.cpp:367] Top shape: 64 (64)
[1] I0427 11:07:09.584663    97 net.cpp:375] Memory required for data: 200960
[1] I0427 11:07:09.584671    97 layer_factory.hpp:114] Creating layer conv1
[0] I0427 11:07:09.580799   125 net.cpp:1238] conv1 -> conv1
[1] I0427 11:07:09.584699    97 net.cpp:265] Creating Layer conv1
[1] I0427 11:07:09.584709    97 net.cpp:1264] conv1 <- data
[1] I0427 11:07:09.584722    97 net.cpp:1238] conv1 -> conv1
[1] I0427 11:07:09.593472    97 net.cpp:360] Setting up conv1
[1] I0427 11:07:09.593502    97 net.cpp:367] Top shape: 64 20 24 24 (737280)
[1] I0427 11:07:09.593509    97 net.cpp:375] Memory required for data: 3150080
[1] I0427 11:07:09.593533    97 layer_factory.hpp:114] Creating layer pool1
[1] I0427 11:07:09.593564    97 net.cpp:265] Creating Layer pool1
[1] I0427 11:07:09.593571    97 net.cpp:1264] pool1 <- conv1
[1] I0427 11:07:09.593588    97 net.cpp:1238] pool1 -> pool1
[1] I0427 11:07:09.593619    97 net.cpp:360] Setting up pool1
[1] I0427 11:07:09.593629    97 net.cpp:367] Top shape: 64 20 12 12 (184320)
[1] I0427 11:07:09.593634    97 net.cpp:375] Memory required for data: 3887360
[1] I0427 11:07:09.593641    97 layer_factory.hpp:114] Creating layer conv2
[1] I0427 11:07:09.593660    97 net.cpp:265] Creating Layer conv2
[1] I0427 11:07:09.593668    97 net.cpp:1264] conv2 <- pool1
[1] I0427 11:07:09.593679    97 net.cpp:1238] conv2 -> conv2
[0] I0427 11:07:09.589804   125 net.cpp:360] Setting up conv1
[0] I0427 11:07:09.589835   125 net.cpp:367] Top shape: 64 20 24 24 (737280)
[0] I0427 11:07:09.589841   125 net.cpp:375] Memory required for data: 3150080
[0] I0427 11:07:09.589869   125 layer_factory.hpp:114] Creating layer pool1
[0] I0427 11:07:09.589892   125 net.cpp:265] Creating Layer pool1
[0] I0427 11:07:09.589900   125 net.cpp:1264] pool1 <- conv1
[0] I0427 11:07:09.589924   125 net.cpp:1238] pool1 -> pool1
[0] I0427 11:07:09.589958   125 net.cpp:360] Setting up pool1
[0] I0427 11:07:09.589969   125 net.cpp:367] Top shape: 64 20 12 12 (184320)
[0] I0427 11:07:09.589975   125 net.cpp:375] Memory required for data: 3887360
[0] I0427 11:07:09.589982   125 layer_factory.hpp:114] Creating layer conv2
[0] I0427 11:07:09.590003   125 net.cpp:265] Creating Layer conv2
[0] I0427 11:07:09.590011   125 net.cpp:1264] conv2 <- pool1
[0] I0427 11:07:09.590021   125 net.cpp:1238] conv2 -> conv2
[1] I0427 11:07:09.596592    97 net.cpp:360] Setting up conv2
[1] I0427 11:07:09.596607    97 net.cpp:367] Top shape: 64 50 8 8 (204800)
[1] I0427 11:07:09.596612    97 net.cpp:375] Memory required for data: 4706560
[1] I0427 11:07:09.596623    97 layer_factory.hpp:114] Creating layer pool2
[1] I0427 11:07:09.596637    97 net.cpp:265] Creating Layer pool2
[1] I0427 11:07:09.596644    97 net.cpp:1264] pool2 <- conv2
[1] I0427 11:07:09.596655    97 net.cpp:1238] pool2 -> pool2
[1] I0427 11:07:09.596678    97 net.cpp:360] Setting up pool2
[1] I0427 11:07:09.596689    97 net.cpp:367] Top shape: 64 50 4 4 (51200)
[1] I0427 11:07:09.596695    97 net.cpp:375] Memory required for data: 4911360
[1] I0427 11:07:09.596700    97 layer_factory.hpp:114] Creating layer ip1
[1] I0427 11:07:09.596725    97 net.cpp:265] Creating Layer ip1
[1] I0427 11:07:09.596734    97 net.cpp:1264] ip1 <- pool2
[1] I0427 11:07:09.596745    97 net.cpp:1238] ip1 -> ip1
[0] I0427 11:07:09.593422   125 net.cpp:360] Setting up conv2
[0] I0427 11:07:09.593438   125 net.cpp:367] Top shape: 64 50 8 8 (204800)
[0] I0427 11:07:09.593443   125 net.cpp:375] Memory required for data: 4706560
[0] I0427 11:07:09.593453   125 layer_factory.hpp:114] Creating layer pool2
[0] I0427 11:07:09.593467   125 net.cpp:265] Creating Layer pool2
[0] I0427 11:07:09.593474   125 net.cpp:1264] pool2 <- conv2
[0] I0427 11:07:09.593487   125 net.cpp:1238] pool2 -> pool2
[0] I0427 11:07:09.593509   125 net.cpp:360] Setting up pool2
[0] I0427 11:07:09.593516   125 net.cpp:367] Top shape: 64 50 4 4 (51200)
[0] I0427 11:07:09.593523   125 net.cpp:375] Memory required for data: 4911360
[0] I0427 11:07:09.593528   125 layer_factory.hpp:114] Creating layer ip1
[0] I0427 11:07:09.593554   125 net.cpp:265] Creating Layer ip1
[0] I0427 11:07:09.593562   125 net.cpp:1264] ip1 <- pool2
[0] I0427 11:07:09.593575   125 net.cpp:1238] ip1 -> ip1
[1] I0427 11:07:09.601646    97 net.cpp:360] Setting up ip1
[1] I0427 11:07:09.601660    97 net.cpp:367] Top shape: 64 500 (32000)
[1] I0427 11:07:09.601665    97 net.cpp:375] Memory required for data: 5039360
[1] I0427 11:07:09.601673    97 layer_factory.hpp:114] Creating layer relu1
[1] I0427 11:07:09.601687    97 net.cpp:265] Creating Layer relu1
[1] I0427 11:07:09.601693    97 net.cpp:1264] relu1 <- ip1
[1] I0427 11:07:09.601701    97 net.cpp:1225] relu1 -> ip1 (in-place)
[1] I0427 11:07:09.601728    97 net.cpp:360] Setting up relu1
[1] I0427 11:07:09.601734    97 net.cpp:367] Top shape: 64 500 (32000)
[1] I0427 11:07:09.601737    97 net.cpp:375] Memory required for data: 5167360
[1] I0427 11:07:09.601742    97 layer_factory.hpp:114] Creating layer ip2
[1] I0427 11:07:09.601764    97 net.cpp:265] Creating Layer ip2
[1] I0427 11:07:09.601769    97 net.cpp:1264] ip2 <- ip1
[1] I0427 11:07:09.601778    97 net.cpp:1238] ip2 -> ip2
[1] I0427 11:07:09.601825    97 net.cpp:360] Setting up ip2
[1] I0427 11:07:09.601832    97 net.cpp:367] Top shape: 64 10 (640)
[1] I0427 11:07:09.601836    97 net.cpp:375] Memory required for data: 5169920
[1] I0427 11:07:09.601841    97 layer_factory.hpp:114] Creating layer loss
[1] I0427 11:07:09.601852    97 net.cpp:265] Creating Layer loss
[1] I0427 11:07:09.601857    97 net.cpp:1264] loss <- ip2
[1] I0427 11:07:09.601882    97 net.cpp:1264] loss <- label
[0] I0427 11:07:09.598147   125 net.cpp:360] Setting up ip1
[0] I0427 11:07:09.598165   125 net.cpp:367] Top shape: 64 500 (32000)
[0] I0427 11:07:09.598167   125 net.cpp:375] Memory required for data: 5039360
[0] I0427 11:07:09.598176   125 layer_factory.hpp:114] Creating layer relu1
[0] I0427 11:07:09.598192   125 net.cpp:265] Creating Layer relu1
[0] I0427 11:07:09.598196   125 net.cpp:1264] relu1 <- ip1
[0] I0427 11:07:09.598202   125 net.cpp:1225] relu1 -> ip1 (in-place)
[1] I0427 11:07:09.601887    97 net.cpp:1238] loss -> loss
[0] I0427 11:07:09.598230   125 net.cpp:360] Setting up relu1
[0] I0427 11:07:09.598235   125 net.cpp:367] Top shape: 64 500 (32000)
[0] I0427 11:07:09.598253   125 net.cpp:375] Memory required for data: 5167360
[0] I0427 11:07:09.598256   125 layer_factory.hpp:114] Creating layer ip2
[1] I0427 11:07:09.601902    97 layer_factory.hpp:114] Creating layer loss
[0] I0427 11:07:09.598268   125 net.cpp:265] Creating Layer ip2
[0] I0427 11:07:09.598270   125 net.cpp:1264] ip2 <- ip1
[1] I0427 11:07:09.601929    97 net.cpp:360] Setting up loss
[1] I0427 11:07:09.601938    97 net.cpp:367] Top shape: (1)
[0] I0427 11:07:09.598278   125 net.cpp:1238] ip2 -> ip2
[1] I0427 11:07:09.601940    97 net.cpp:370]     with loss weight 0.5
[1] I0427 11:07:09.601959    97 net.cpp:375] Memory required for data: 5169924
[0] I0427 11:07:09.598335   125 net.cpp:360] Setting up ip2
[0] I0427 11:07:09.598343   125 net.cpp:367] Top shape: 64 10 (640)
[0] I0427 11:07:09.598346   125 net.cpp:375] Memory required for data: 5169920
[1] I0427 11:07:09.601963    97 net.cpp:437] loss needs backward computation.
[1] I0427 11:07:09.601968    97 net.cpp:437] ip2 needs backward computation.
[0] I0427 11:07:09.598352   125 layer_factory.hpp:114] Creating layer loss
[1] I0427 11:07:09.601971    97 net.cpp:437] relu1 needs backward computation.
[0] I0427 11:07:09.598363   125 net.cpp:265] Creating Layer loss
[1] I0427 11:07:09.601975    97 net.cpp:437] ip1 needs backward computation.
[0] I0427 11:07:09.598367   125 net.cpp:1264] loss <- ip2
[0] I0427 11:07:09.598397   125 net.cpp:1264] loss <- label
[1] I0427 11:07:09.601984    97 net.cpp:437] pool2 needs backward computation.
[0] I0427 11:07:09.598403   125 net.cpp:1238] loss -> loss
[1] I0427 11:07:09.601986    97 net.cpp:437] conv2 needs backward computation.
[1] I0427 11:07:09.601992    97 net.cpp:437] pool1 needs backward computation.
[0] I0427 11:07:09.598418   125 layer_factory.hpp:114] Creating layer loss
[1] I0427 11:07:09.601997    97 net.cpp:437] conv1 needs backward computation.
[0] I0427 11:07:09.598453   125 net.cpp:360] Setting up loss
[0] I0427 11:07:09.598461   125 net.cpp:367] Top shape: (1)
[1] I0427 11:07:09.602001    97 net.cpp:439] mnist does not need backward computation.
[0] I0427 11:07:09.598465   125 net.cpp:370]     with loss weight 0.5
[0] I0427 11:07:09.598489   125 net.cpp:375] Memory required for data: 5169924
[0] I0427 11:07:09.598493   125 net.cpp:437] loss needs backward computation.
[1] I0427 11:07:09.602005    97 net.cpp:481] This network produces output loss
[0] I0427 11:07:09.598498   125 net.cpp:437] ip2 needs backward computation.
[1] I0427 11:07:09.602017    97 net.cpp:521] Network initialization done.
[0] I0427 11:07:09.598502   125 net.cpp:437] relu1 needs backward computation.
[0] I0427 11:07:09.598505   125 net.cpp:437] ip1 needs backward computation.
[0] I0427 11:07:09.598510   125 net.cpp:437] pool2 needs backward computation.
[1] I0427 11:07:09.602191    97 solver.cpp:249] Creating test net (#0) specified by net file: examples/mnist/lenet_train_test_mlsl.prototxt
[1] I0427 11:07:09.602200    97 cpu_info.cpp:453] Processor speed [MHz]: 2300
[0] I0427 11:07:09.598515   125 net.cpp:437] conv2 needs backward computation.
[1] I0427 11:07:09.602202    97 cpu_info.cpp:456] Total number of sockets: 2
[1] I0427 11:07:09.602205    97 cpu_info.cpp:459] Total number of CPU cores: 36
[1] I0427 11:07:09.602210    97 cpu_info.cpp:462] Total number of processors: 72
[0] I0427 11:07:09.598520   125 net.cpp:437] pool1 needs backward computation.
[1] I0427 11:07:09.602213    97 cpu_info.cpp:465] GPU is used: no
[1] I0427 11:07:09.602217    97 cpu_info.cpp:468] OpenMP environmental variables are specified: yes
[0] I0427 11:07:09.598523   125 net.cpp:437] conv1 needs backward computation.
[0] I0427 11:07:09.598528   125 net.cpp:439] mnist does not need backward computation.
[0] I0427 11:07:09.598531   125 net.cpp:481] This network produces output loss
[1] I0427 11:07:09.602221    97 cpu_info.cpp:471] OpenMP thread bind allowed: no
[0] I0427 11:07:09.598546   125 net.cpp:521] Network initialization done.
[1] I0427 11:07:09.602226    97 cpu_info.cpp:474] Number of OpenMP threads: 36
[0] I0427 11:07:09.598822   125 solver.cpp:249] Creating test net (#0) specified by net file: examples/mnist/lenet_train_test_mlsl.prototxt
[0] I0427 11:07:09.598834   125 cpu_info.cpp:453] Processor speed [MHz]: 2300
[1] I0427 11:07:09.602244    97 net.cpp:1052] The NetState phase (1) differed from the phase (0) specified by a rule in layer mnist
[0] I0427 11:07:09.598839   125 cpu_info.cpp:456] Total number of sockets: 2
[0] I0427 11:07:09.598841   125 cpu_info.cpp:459] Total number of CPU cores: 36
[0] I0427 11:07:09.598845   125 cpu_info.cpp:462] Total number of processors: 72
[0] I0427 11:07:09.598848   125 cpu_info.cpp:465] GPU is used: no
[1] I0427 11:07:09.602470    97 net.cpp:207] Initializing net from parameters:
[0] I0427 11:07:09.598852   125 cpu_info.cpp:468] OpenMP environmental variables are specified: yes
[0] I0427 11:07:09.598855   125 cpu_info.cpp:471] OpenMP thread bind allowed: no
[1] I0427 11:07:09.602483    97 net.cpp:208]
[0] I0427 11:07:09.598858   125 cpu_info.cpp:474] Number of OpenMP threads: 36
[1] name: "LeNet"
[1] state {
[1]   phase: TEST
[1] }
[1] engine: "MKLDNN"
[1] compile_net_state {
[1]   bn_scale_remove: false
[1]   bn_scale_merge: false
[1] }
[1] layer {
[1]   name: "mnist"
[1]   type: "Data"
[1]   top: "data"
[1]   top: "label"
[1]   include {
[1]     phase: TEST
[1]   }
[1]   transform_param {
[1]     scale: 0.00390625
[1]   }
[1]   data_param {
[1]     source: "examples/mnist/mnist_test_lmdb"
[1]     batch_size: 100
[1]     backend: LMDB
[1]   }
[1] }
[1] layer {
[1]   name: "label_mnist_1_split"
[1]   type: "Split"
[1]   bottom: "label"
[1]   top: "label_mnist_1_split_0"
[1]   top: "label_mnist_1_split_1"
[1] }
[1] layer {
[1]   name: "conv1"
[1]   type: "Convolution"
[1]   bottom: "data"
[1]   top: "conv1"
[1]   param {
[1]     lr_mult: 1
[1]   }
[1]   convolution_param {
[1]     num_output: 20
[1]     bias_term: false
[1]     kernel_size: 5
[1]     stride: 1
[1]     weight_filler {
[1]       type: "xavier"
[1]     }
[1]     engine: MKL2017
[1]   }
[1] }
[1] layer {
[1]   name: "pool1"
[1]   type: "Pooling"
[1]   bottom: "conv1"
[1]   top: "pool1"
[1]   pooling_param {
[1]     pool: MAX
[1]     kernel_size: 2
[1]     stride: 2
[1]     engine: MKL2017
[1]   }
[1] }
[1] layer {
[1]   name: "conv2"
[1]   type: "Convolution"
[1]   bottom: "pool1"
[1]   top: "conv2"
[1]   param {
[1]     lr_mult: 1
[1]   }
[1]   convolution_param {
[1]     num_output: 50
[1]     bias_term: false
[1]     kernel_size: 5
[1]     stride: 1
[1]     weight_filler {
[1]       type: "xavier"
[1]     }
[1]     engine: MKL2017
[1]   }
[1] }
[1] layer {
[1]   name: "pool2"
[1]   type: "Pooling"
[1]   bottom: "conv2"
[1]   top: "pool2"
[1]   pooling_param {
[1]     pool: MAX
[1]     kernel_size: 2
[1]     stride: 2
[1]     engine: MKL2017
[1]   }
[1] }
[1] layer {
[1]   name: "ip1"
[1]   type: "InnerProduct"
[1]   bottom: "pool2"
[1]   top: "ip1"
[1]   param {
[1]     lr_mult: 1
[1]   }
[1]   inner_product_param {
[1]     num_output: 500
[1]     bias_term: false
[1]     weight_filler {
[1]       type: "xavier"
[1]     }
[1]   }
[1] }
[1] layer {
[1]   name: "relu1"
[1]   type: "ReLU"
[1]   bottom: "ip1"
[1]   top: "ip1"
[1]   relu_param {
[1]     engine: MKL2017
[1]   }
[1] }
[1] layer {
[1]   name: "ip2"
[1]   type: "InnerProduct"
[1]   bottom: "ip1"
[1]   top: "ip2"
[1]   param {
[1]     lr_mult: 1
[1]   }
[1]   inner_product_param {
[1]     num_output: 10
[1]     bias_term: false
[1]     weight_filler {
[1]       type: "xavier"
[1]     }
[1]   }
[1] }
[1] layer {
[1]   name: "ip2_ip2_0_split"
[1]   type: "Split"
[1]   bottom: "ip2"
[1]   top: "ip2_ip2_0_split_0"
[1]   top: "ip2_ip2_0_split_1"
[1] }
[1] layer {
[1]   name: "accuracy"
[1]   type: "Accuracy"
[1]   bottom: "ip2_ip2_0_split_0"
[1]   bottom: "label_mnist_1_split_0"
[1]   top: "accuracy"
[1]   include {
[1]     phase: TEST
[1]   }
[1] }
[1] layer {
[1]   name: "loss"
[1]   type: "SoftmaxWithLoss"
[1]   bottom: "ip2_ip2_0_split_1"
[1]   bottom: "label_mnist_1_split_1"
[1]   top: "loss"
[1] }
[0] I0427 11:07:09.598881   125 net.cpp:1052] The NetState phase (1) differed from the phase (0) specified by a rule in layer mnist
[1] I0427 11:07:09.602598    97 layer_factory.hpp:114] Creating layer mnist
[0] I0427 11:07:09.599122   125 net.cpp:207] Initializing net from parameters:
[1] I0427 11:07:09.602694    97 net.cpp:265] Creating Layer mnist
[1] I0427 11:07:09.602701    97 net.cpp:1238] mnist -> data
[0] I0427 11:07:09.599133   125 net.cpp:208]
[1] I0427 11:07:09.602710    97 net.cpp:1238] mnist -> label
[0] name: "LeNet"
[0] state {
[0]   phase: TEST
[0] }
[0] engine: "MKLDNN"
[0] compile_net_state {
[0]   bn_scale_remove: false
[0]   bn_scale_merge: false
[0] }
[0] layer {
[0]   name: "mnist"
[0]   type: "Data"
[0]   top: "data"
[0]   top: "label"
[0]   include {
[0]     phase: TEST
[0]   }
[0]   transform_param {
[0]     scale: 0.00390625
[0]   }
[0]   data_param {
[0]     source: "examples/mnist/mnist_test_lmdb"
[0]     batch_size: 100
[0]     backend: LMDB
[0]   }
[0] }
[0] layer {
[0]   name: "label_mnist_1_split"
[0]   type: "Split"
[0]   bottom: "label"
[0]   top: "label_mnist_1_split_0"
[0]   top: "label_mnist_1_split_1"
[0] }
[0] layer {
[0]   name: "conv1"
[0]   type: "Convolution"
[0]   bottom: "data"
[0]   top: "conv1"
[0]   param {
[0]     lr_mult: 1
[0]   }
[0]   convolution_param {
[0]     num_output: 20
[0]     bias_term: false
[0]     kernel_size: 5
[0]     stride: 1
[0]     weight_filler {
[0]       type: "xavier"
[0]     }
[0]     engine: MKL2017
[0]   }
[0] }
[0] layer {
[0]   name: "pool1"
[0]   type: "Pooling"
[0]   bottom: "conv1"
[0]   top: "pool1"
[0]   pooling_param {
[0]     pool: MAX
[0]     kernel_size: 2
[0]     stride: 2
[0]     engine: MKL2017
[0]   }
[0] }
[0] layer {
[0]   name: "conv2"
[0]   type: "Convolution"
[0]   bottom: "pool1"
[0]   top: "conv2"
[0]   param {
[0]     lr_mult: 1
[0]   }
[0]   convolution_param {
[0]     num_output: 50
[0]     bias_term: false
[0]     kernel_size: 5
[0]     stride: 1
[0]     weight_filler {
[0]       type: "xavier"
[0]     }
[0]     engine: MKL2017
[0]   }
[0] }
[0] layer {
[0]   name: "pool2"
[0]   type: "Pooling"
[0]   bottom: "conv2"
[0]   top: "pool2"
[0]   pooling_param {
[0]     pool: MAX
[0]     kernel_size: 2
[0]     stride: 2
[0]     engine: MKL2017
[0]   }
[0] }
[0] layer {
[0]   name: "ip1"
[0]   type: "InnerProduct"
[0]   bottom: "pool2"
[0]   top: "ip1"
[0]   param {
[0]     lr_mult: 1
[0]   }
[0]   inner_product_param {
[0]     num_output: 500
[0]     bias_term: false
[0]     weight_filler {
[0]       type: "xavier"
[0]     }
[0]   }
[0] }
[0] layer {
[0]   name: "relu1"
[0]   type: "ReLU"
[0]   bottom: "ip1"
[0]   top: "ip1"
[0]   relu_param {
[0]     engine: MKL2017
[0]   }
[0] }
[0] layer {
[0]   name: "ip2"
[0]   type: "InnerProduct"
[0]   bottom: "ip1"
[0]   top: "ip2"
[0]   param {
[0]     lr_mult: 1
[0]   }
[0]   inner_product_param {
[0]     num_output: 10
[0]     bias_term: false
[0]     weight_filler {
[0]       type: "xavier"
[0]     }
[0]   }
[0] }
[0] layer {
[0]   name: "ip2_ip2_0_split"
[0]   type: "Split"
[0]   bottom: "ip2"
[0]   top: "ip2_ip2_0_split_0"
[0]   top: "ip2_ip2_0_split_1"
[0] }
[0] layer {
[0]   name: "accuracy"
[0]   type: "Accuracy"
[0]   bottom: "ip2_ip2_0_split_0"
[0]   bottom: "label_mnist_1_split_0"
[0]   top: "accuracy"
[0]   include {
[0]     phase: TEST
[0]   }
[0] }
[0] layer {
[0]   name: "loss"
[0]   type: "SoftmaxWithLoss"
[0]   bottom: "ip2_ip2_0_split_1"
[0]   bottom: "label_mnist_1_split_1"
[0]   top: "loss"
[0] }
[1] I0427 11:07:09.602819   100 internal_thread.cpp:135] Internal thread is affinitized to core 71
[0] I0427 11:07:09.599277   125 layer_factory.hpp:114] Creating layer mnist
[1] I0427 11:07:09.602931   100 db_lmdb.cpp:72] Opened lmdb examples/mnist/mnist_test_lmdb
[0] I0427 11:07:09.599380   125 net.cpp:265] Creating Layer mnist
[0] I0427 11:07:09.599388   125 net.cpp:1238] mnist -> data
[1] I0427 11:07:09.602979    97 data_layer.cpp:80] output data size: 100,1,28,28
[0] I0427 11:07:09.599397   125 net.cpp:1238] mnist -> label
[1] I0427 11:07:09.603960    97 net.cpp:360] Setting up mnist
[1] I0427 11:07:09.603971    97 net.cpp:367] Top shape: 100 1 28 28 (78400)
[0] I0427 11:07:09.599514   128 internal_thread.cpp:135] Internal thread is affinitized to core 71
[1] I0427 11:07:09.603976    97 net.cpp:367] Top shape: 100 (100)
[1] I0427 11:07:09.603979    97 net.cpp:375] Memory required for data: 314000
[0] I0427 11:07:09.599640   128 db_lmdb.cpp:72] Opened lmdb examples/mnist/mnist_test_lmdb
[1] I0427 11:07:09.603984    97 layer_factory.hpp:114] Creating layer label_mnist_1_split
[0] I0427 11:07:09.599706   125 data_layer.cpp:80] output data size: 100,1,28,28
[1] I0427 11:07:09.603998    97 net.cpp:265] Creating Layer label_mnist_1_split
[1] I0427 11:07:09.604004    97 net.cpp:1264] label_mnist_1_split <- label
[1] I0427 11:07:09.604012    97 net.cpp:1238] label_mnist_1_split -> label_mnist_1_split_0
[1] I0427 11:07:09.604019    97 net.cpp:1238] label_mnist_1_split -> label_mnist_1_split_1
[1] I0427 11:07:09.604041    97 net.cpp:360] Setting up label_mnist_1_split
[1] I0427 11:07:09.604048    97 net.cpp:367] Top shape: 100 (100)
[1] I0427 11:07:09.604053    97 net.cpp:367] Top shape: 100 (100)
[1] I0427 11:07:09.604055    97 net.cpp:375] Memory required for data: 314800
[1] I0427 11:07:09.604060    97 layer_factory.hpp:114] Creating layer conv1
[1] I0427 11:07:09.604071    97 net.cpp:265] Creating Layer conv1
[1] I0427 11:07:09.604077    97 net.cpp:1264] conv1 <- data
[1] I0427 11:07:09.604084    97 net.cpp:1238] conv1 -> conv1
[0] I0427 11:07:09.600899   125 net.cpp:360] Setting up mnist
[0] I0427 11:07:09.600913   125 net.cpp:367] Top shape: 100 1 28 28 (78400)
[0] I0427 11:07:09.600919   125 net.cpp:367] Top shape: 100 (100)
[0] I0427 11:07:09.600924   125 net.cpp:375] Memory required for data: 314000
[0] I0427 11:07:09.600927   125 layer_factory.hpp:114] Creating layer label_mnist_1_split
[0] I0427 11:07:09.600944   125 net.cpp:265] Creating Layer label_mnist_1_split
[0] I0427 11:07:09.600950   125 net.cpp:1264] label_mnist_1_split <- label
[0] I0427 11:07:09.600958   125 net.cpp:1238] label_mnist_1_split -> label_mnist_1_split_0
[0] I0427 11:07:09.600972   125 net.cpp:1238] label_mnist_1_split -> label_mnist_1_split_1
[0] I0427 11:07:09.600991   125 net.cpp:360] Setting up label_mnist_1_split
[0] I0427 11:07:09.600998   125 net.cpp:367] Top shape: 100 (100)
[0] I0427 11:07:09.601004   125 net.cpp:367] Top shape: 100 (100)
[0] I0427 11:07:09.601007   125 net.cpp:375] Memory required for data: 314800
[0] I0427 11:07:09.601011   125 layer_factory.hpp:114] Creating layer conv1
[0] I0427 11:07:09.601023   125 net.cpp:265] Creating Layer conv1
[0] I0427 11:07:09.601032   125 net.cpp:1264] conv1 <- data
[0] I0427 11:07:09.601037   125 net.cpp:1238] conv1 -> conv1
[1] I0427 11:07:09.605427    97 net.cpp:360] Setting up conv1
[1] I0427 11:07:09.605438    97 net.cpp:367] Top shape: 100 20 24 24 (1152000)
[1] I0427 11:07:09.605442    97 net.cpp:375] Memory required for data: 4922800
[1] I0427 11:07:09.605449    97 layer_factory.hpp:114] Creating layer pool1
[1] I0427 11:07:09.605459    97 net.cpp:265] Creating Layer pool1
[1] I0427 11:07:09.605468    97 net.cpp:1264] pool1 <- conv1
[1] I0427 11:07:09.605473    97 net.cpp:1238] pool1 -> pool1
[1] I0427 11:07:09.605485    97 net.cpp:360] Setting up pool1
[1] I0427 11:07:09.605494    97 net.cpp:367] Top shape: 100 20 12 12 (288000)
[1] I0427 11:07:09.605496    97 net.cpp:375] Memory required for data: 6074800
[1] I0427 11:07:09.605500    97 layer_factory.hpp:114] Creating layer conv2
[1] I0427 11:07:09.605515    97 net.cpp:265] Creating Layer conv2
[1] I0427 11:07:09.605520    97 net.cpp:1264] conv2 <- pool1
[1] I0427 11:07:09.605531    97 net.cpp:1238] conv2 -> conv2
[0] I0427 11:07:09.602587   125 net.cpp:360] Setting up conv1
[0] I0427 11:07:09.602598   125 net.cpp:367] Top shape: 100 20 24 24 (1152000)
[0] I0427 11:07:09.602602   125 net.cpp:375] Memory required for data: 4922800
[0] I0427 11:07:09.602609   125 layer_factory.hpp:114] Creating layer pool1
[0] I0427 11:07:09.602619   125 net.cpp:265] Creating Layer pool1
[0] I0427 11:07:09.602623   125 net.cpp:1264] pool1 <- conv1
[0] I0427 11:07:09.602630   125 net.cpp:1238] pool1 -> pool1
[0] I0427 11:07:09.602644   125 net.cpp:360] Setting up pool1
[0] I0427 11:07:09.602649   125 net.cpp:367] Top shape: 100 20 12 12 (288000)
[0] I0427 11:07:09.602654   125 net.cpp:375] Memory required for data: 6074800
[0] I0427 11:07:09.602659   125 layer_factory.hpp:114] Creating layer conv2
[0] I0427 11:07:09.602682   125 net.cpp:265] Creating Layer conv2
[0] I0427 11:07:09.602685   125 net.cpp:1264] conv2 <- pool1
[0] I0427 11:07:09.602691   125 net.cpp:1238] conv2 -> conv2
[1] I0427 11:07:09.607254    97 net.cpp:360] Setting up conv2
[1] I0427 11:07:09.607264    97 net.cpp:367] Top shape: 100 50 8 8 (320000)
[1] I0427 11:07:09.607269    97 net.cpp:375] Memory required for data: 7354800
[1] I0427 11:07:09.607275    97 layer_factory.hpp:114] Creating layer pool2
[1] I0427 11:07:09.607285    97 net.cpp:265] Creating Layer pool2
[1] I0427 11:07:09.607290    97 net.cpp:1264] pool2 <- conv2
[1] I0427 11:07:09.607298    97 net.cpp:1238] pool2 -> pool2
[1] I0427 11:07:09.607311    97 net.cpp:360] Setting up pool2
[1] I0427 11:07:09.607318    97 net.cpp:367] Top shape: 100 50 4 4 (80000)
[1] I0427 11:07:09.607322    97 net.cpp:375] Memory required for data: 7674800
[1] I0427 11:07:09.607336    97 layer_factory.hpp:114] Creating layer ip1
[1] I0427 11:07:09.607347    97 net.cpp:265] Creating Layer ip1
[1] I0427 11:07:09.607355    97 net.cpp:1264] ip1 <- pool2
[1] I0427 11:07:09.607362    97 net.cpp:1238] ip1 -> ip1
[0] I0427 11:07:09.604876   125 net.cpp:360] Setting up conv2
[0] I0427 11:07:09.604885   125 net.cpp:367] Top shape: 100 50 8 8 (320000)
[0] I0427 11:07:09.604889   125 net.cpp:375] Memory required for data: 7354800
[0] I0427 11:07:09.604895   125 layer_factory.hpp:114] Creating layer pool2
[0] I0427 11:07:09.604905   125 net.cpp:265] Creating Layer pool2
[0] I0427 11:07:09.604909   125 net.cpp:1264] pool2 <- conv2
[0] I0427 11:07:09.604917   125 net.cpp:1238] pool2 -> pool2
[0] I0427 11:07:09.604931   125 net.cpp:360] Setting up pool2
[0] I0427 11:07:09.604939   125 net.cpp:367] Top shape: 100 50 4 4 (80000)
[0] I0427 11:07:09.604943   125 net.cpp:375] Memory required for data: 7674800
[0] I0427 11:07:09.604948   125 layer_factory.hpp:114] Creating layer ip1
[0] I0427 11:07:09.604955   125 net.cpp:265] Creating Layer ip1
[0] I0427 11:07:09.604960   125 net.cpp:1264] ip1 <- pool2
[0] I0427 11:07:09.604975   125 net.cpp:1238] ip1 -> ip1
[1] I0427 11:07:09.610730    97 net.cpp:360] Setting up ip1
[1] I0427 11:07:09.610744    97 net.cpp:367] Top shape: 100 500 (50000)
[1] I0427 11:07:09.610749    97 net.cpp:375] Memory required for data: 7874800
[1] I0427 11:07:09.610756    97 layer_factory.hpp:114] Creating layer relu1
[1] I0427 11:07:09.610769    97 net.cpp:265] Creating Layer relu1
[1] I0427 11:07:09.610775    97 net.cpp:1264] relu1 <- ip1
[1] I0427 11:07:09.610783    97 net.cpp:1225] relu1 -> ip1 (in-place)
[1] I0427 11:07:09.610796    97 net.cpp:360] Setting up relu1
[1] I0427 11:07:09.610803    97 net.cpp:367] Top shape: 100 500 (50000)
[1] I0427 11:07:09.610807    97 net.cpp:375] Memory required for data: 8074800
[1] I0427 11:07:09.610828    97 layer_factory.hpp:114] Creating layer ip2
[1] I0427 11:07:09.610841    97 net.cpp:265] Creating Layer ip2
[1] I0427 11:07:09.610847    97 net.cpp:1264] ip2 <- ip1
[1] I0427 11:07:09.610855    97 net.cpp:1238] ip2 -> ip2
[1] I0427 11:07:09.610903    97 net.cpp:360] Setting up ip2
[1] I0427 11:07:09.610909    97 net.cpp:367] Top shape: 100 10 (1000)
[1] I0427 11:07:09.610913    97 net.cpp:375] Memory required for data: 8078800
[1] I0427 11:07:09.610920    97 layer_factory.hpp:114] Creating layer ip2_ip2_0_split
[1] I0427 11:07:09.610929    97 net.cpp:265] Creating Layer ip2_ip2_0_split
[1] I0427 11:07:09.610931    97 net.cpp:1264] ip2_ip2_0_split <- ip2
[1] I0427 11:07:09.610939    97 net.cpp:1238] ip2_ip2_0_split -> ip2_ip2_0_split_0
[1] I0427 11:07:09.610947    97 net.cpp:1238] ip2_ip2_0_split -> ip2_ip2_0_split_1
[1] I0427 11:07:09.610955    97 net.cpp:360] Setting up ip2_ip2_0_split
[1] I0427 11:07:09.610960    97 net.cpp:367] Top shape: 100 10 (1000)
[1] I0427 11:07:09.610965    97 net.cpp:367] Top shape: 100 10 (1000)
[1] I0427 11:07:09.610967    97 net.cpp:375] Memory required for data: 8086800
[1] I0427 11:07:09.610971    97 layer_factory.hpp:114] Creating layer accuracy
[1] I0427 11:07:09.610980    97 net.cpp:265] Creating Layer accuracy
[1] I0427 11:07:09.610987    97 net.cpp:1264] accuracy <- ip2_ip2_0_split_0
[1] I0427 11:07:09.610991    97 net.cpp:1264] accuracy <- label_mnist_1_split_0
[1] I0427 11:07:09.610999    97 net.cpp:1238] accuracy -> accuracy
[1] I0427 11:07:09.611009    97 net.cpp:360] Setting up accuracy
[1] I0427 11:07:09.611014    97 net.cpp:367] Top shape: (1)
[1] I0427 11:07:09.611018    97 net.cpp:375] Memory required for data: 8086804
[1] I0427 11:07:09.611021    97 layer_factory.hpp:114] Creating layer loss
[1] I0427 11:07:09.611028    97 net.cpp:265] Creating Layer loss
[1] I0427 11:07:09.611032    97 net.cpp:1264] loss <- ip2_ip2_0_split_1
[1] I0427 11:07:09.611037    97 net.cpp:1264] loss <- label_mnist_1_split_1
[1] I0427 11:07:09.611042    97 net.cpp:1238] loss -> loss
[1] I0427 11:07:09.611052    97 layer_factory.hpp:114] Creating layer loss
[1] I0427 11:07:09.611073    97 net.cpp:360] Setting up loss
[1] I0427 11:07:09.611081    97 net.cpp:367] Top shape: (1)
[1] I0427 11:07:09.611084    97 net.cpp:370]     with loss weight 0.5
[1] I0427 11:07:09.611091    97 net.cpp:375] Memory required for data: 8086808
[1] I0427 11:07:09.611095    97 net.cpp:437] loss needs backward computation.
[1] I0427 11:07:09.611100    97 net.cpp:439] accuracy does not need backward computation.
[1] I0427 11:07:09.611105    97 net.cpp:437] ip2_ip2_0_split needs backward computation.
[1] I0427 11:07:09.611109    97 net.cpp:437] ip2 needs backward computation.
[1] I0427 11:07:09.611114    97 net.cpp:437] relu1 needs backward computation.
[1] I0427 11:07:09.611116    97 net.cpp:437] ip1 needs backward computation.
[1] I0427 11:07:09.611120    97 net.cpp:437] pool2 needs backward computation.
[1] I0427 11:07:09.611124    97 net.cpp:437] conv2 needs backward computation.
[1] I0427 11:07:09.611129    97 net.cpp:437] pool1 needs backward computation.
[1] I0427 11:07:09.611132    97 net.cpp:437] conv1 needs backward computation.
[1] I0427 11:07:09.611136    97 net.cpp:439] label_mnist_1_split does not need backward computation.
[1] I0427 11:07:09.611141    97 net.cpp:439] mnist does not need backward computation.
[1] I0427 11:07:09.611147    97 net.cpp:481] This network produces output accuracy
[1] I0427 11:07:09.611150    97 net.cpp:481] This network produces output loss
[1] I0427 11:07:09.611165    97 net.cpp:521] Network initialization done.
[1] I0427 11:07:09.611233    97 solver.cpp:121] Solver scaffolding done.
[0] I0427 11:07:09.608055   125 net.cpp:360] Setting up ip1
[0] I0427 11:07:09.608070   125 net.cpp:367] Top shape: 100 500 (50000)
[0] I0427 11:07:09.608073   125 net.cpp:375] Memory required for data: 7874800
[0] I0427 11:07:09.608079   125 layer_factory.hpp:114] Creating layer relu1
[0] I0427 11:07:09.608090   125 net.cpp:265] Creating Layer relu1
[0] I0427 11:07:09.608093   125 net.cpp:1264] relu1 <- ip1
[0] I0427 11:07:09.608098   125 net.cpp:1225] relu1 -> ip1 (in-place)
[0] I0427 11:07:09.608114   125 net.cpp:360] Setting up relu1
[0] I0427 11:07:09.608117   125 net.cpp:367] Top shape: 100 500 (50000)
[0] I0427 11:07:09.608121   125 net.cpp:375] Memory required for data: 8074800
[0] I0427 11:07:09.608141   125 layer_factory.hpp:114] Creating layer ip2
[0] I0427 11:07:09.608151   125 net.cpp:265] Creating Layer ip2
[0] I0427 11:07:09.608155   125 net.cpp:1264] ip2 <- ip1
[0] I0427 11:07:09.608160   125 net.cpp:1238] ip2 -> ip2
[0] I0427 11:07:09.608194   125 net.cpp:360] Setting up ip2
[0] I0427 11:07:09.608201   125 net.cpp:367] Top shape: 100 10 (1000)
[0] I0427 11:07:09.608203   125 net.cpp:375] Memory required for data: 8078800
[0] I0427 11:07:09.608207   125 layer_factory.hpp:114] Creating layer ip2_ip2_0_split
[0] I0427 11:07:09.608213   125 net.cpp:265] Creating Layer ip2_ip2_0_split
[0] I0427 11:07:09.608216   125 net.cpp:1264] ip2_ip2_0_split <- ip2
[0] I0427 11:07:09.608222   125 net.cpp:1238] ip2_ip2_0_split -> ip2_ip2_0_split_0
[0] I0427 11:07:09.608227   125 net.cpp:1238] ip2_ip2_0_split -> ip2_ip2_0_split_1
[0] I0427 11:07:09.608235   125 net.cpp:360] Setting up ip2_ip2_0_split
[0] I0427 11:07:09.608238   125 net.cpp:367] Top shape: 100 10 (1000)
[0] I0427 11:07:09.608242   125 net.cpp:367] Top shape: 100 10 (1000)
[0] I0427 11:07:09.608245   125 net.cpp:375] Memory required for data: 8086800
[0] I0427 11:07:09.608248   125 layer_factory.hpp:114] Creating layer accuracy
[0] I0427 11:07:09.608258   125 net.cpp:265] Creating Layer accuracy
[0] I0427 11:07:09.608261   125 net.cpp:1264] accuracy <- ip2_ip2_0_split_0
[0] I0427 11:07:09.608266   125 net.cpp:1264] accuracy <- label_mnist_1_split_0
[0] I0427 11:07:09.608273   125 net.cpp:1238] accuracy -> accuracy
[0] I0427 11:07:09.608281   125 net.cpp:360] Setting up accuracy
[0] I0427 11:07:09.608286   125 net.cpp:367] Top shape: (1)
[0] I0427 11:07:09.608289   125 net.cpp:375] Memory required for data: 8086804
[0] I0427 11:07:09.608292   125 layer_factory.hpp:114] Creating layer loss
[0] I0427 11:07:09.608299   125 net.cpp:265] Creating Layer loss
[0] I0427 11:07:09.608304   125 net.cpp:1264] loss <- ip2_ip2_0_split_1
[0] I0427 11:07:09.608307   125 net.cpp:1264] loss <- label_mnist_1_split_1
[0] I0427 11:07:09.608312   125 net.cpp:1238] loss -> loss
[0] I0427 11:07:09.608317   125 layer_factory.hpp:114] Creating layer loss
[0] I0427 11:07:09.608337   125 net.cpp:360] Setting up loss
[0] I0427 11:07:09.608345   125 net.cpp:367] Top shape: (1)
[0] I0427 11:07:09.608347   125 net.cpp:370]     with loss weight 0.5
[0] I0427 11:07:09.608353   125 net.cpp:375] Memory required for data: 8086808
[0] I0427 11:07:09.608356   125 net.cpp:437] loss needs backward computation.
[0] I0427 11:07:09.608361   125 net.cpp:439] accuracy does not need backward computation.
[0] I0427 11:07:09.608368   125 net.cpp:437] ip2_ip2_0_split needs backward computation.
[0] I0427 11:07:09.608371   125 net.cpp:437] ip2 needs backward computation.
[0] I0427 11:07:09.608373   125 net.cpp:437] relu1 needs backward computation.
[0] I0427 11:07:09.608376   125 net.cpp:437] ip1 needs backward computation.
[0] I0427 11:07:09.608379   125 net.cpp:437] pool2 needs backward computation.
[0] I0427 11:07:09.608381   125 net.cpp:437] conv2 needs backward computation.
[0] I0427 11:07:09.608384   125 net.cpp:437] pool1 needs backward computation.
[0] I0427 11:07:09.608388   125 net.cpp:437] conv1 needs backward computation.
[0] I0427 11:07:09.608392   125 net.cpp:439] label_mnist_1_split does not need backward computation.
[0] I0427 11:07:09.608395   125 net.cpp:439] mnist does not need backward computation.
[0] I0427 11:07:09.608397   125 net.cpp:481] This network produces output accuracy
[0] I0427 11:07:09.608399   125 net.cpp:481] This network produces output loss
[0] I0427 11:07:09.608410   125 net.cpp:521] Network initialization done.
[1] I0427 11:07:09.612504    97 caffe.cpp:325] Configuring multinode setup
[0] I0427 11:07:09.608490   125 solver.cpp:121] Solver scaffolding done.
[1] I0427 11:07:09.612534    97 caffe.cpp:328] Starting Multi-node Optimization in MLSL environment
[0] I0427 11:07:09.608570   125 caffe.cpp:325] Configuring multinode setup
[1] W0427 11:07:09.612536    97 multi_sync.hpp:191] RUN: PER LAYER TIMINGS ARE DISABLED, FORWARD OVERLAP OPTIMIZATION IS ENABLED, WEIGHT GRADIENT COMPRESSION IS DISABLED, SINGLE DB SPLITTING IS DISABLED
[0] I0427 11:07:09.608588   125 caffe.cpp:328] Starting Multi-node Optimization in MLSL environment
[1] I0427 11:07:09.612552    97 multi_sync.hpp:134] synchronize_params: bcast
[0] W0427 11:07:09.608592   125 multi_sync.hpp:191] RUN: PER LAYER TIMINGS ARE DISABLED, FORWARD OVERLAP OPTIMIZATION IS ENABLED, WEIGHT GRADIENT COMPRESSION IS DISABLED, SINGLE DB SPLITTING IS DISABLED
[0] I0427 11:07:09.608605   125 multi_sync.hpp:134] synchronize_params: bcast
[0] I0427 11:07:09.609779   125 solver.cpp:397] Solving LeNet
[0] I0427 11:07:09.609793   125 solver.cpp:398] Learning Rate Policy: inv
[0] I0427 11:07:09.609819   125 multi_sync.hpp:134] synchronize_params: bcast
[1] I0427 11:07:09.614929    97 solver.cpp:397] Solving LeNet
[1] I0427 11:07:09.614940    97 solver.cpp:398] Learning Rate Policy: inv
[1] I0427 11:07:09.614959    97 multi_sync.hpp:134] synchronize_params: bcast
[0] I0427 11:07:09.611517   125 solver.cpp:474] Iteration 0, Testing net (#0)
[1] I0427 11:07:09.617259    97 solver.cpp:474] Iteration 0, Testing net (#0)
[0] I0427 11:07:09.798406   125 solver.cpp:563]     Test net output #0: accuracy = 0.1318
[0] I0427 11:07:09.798449   125 solver.cpp:563]     Test net output #1: loss = 2.41961 (* 1 = 2.41961 loss)
[0] I0427 11:07:09.814437   125 solver.cpp:312] Iteration 0, loss = 2.40042
[0] I0427 11:07:09.814477   125 solver.cpp:333]     Train net output #0: loss = 2.40042 (* 1 = 2.40042 loss)
[0] I0427 11:07:09.814501   125 sgd_solver.cpp:215] Iteration 0, lr = 0.01
[1] I0427 11:07:09.820175    97 solver.cpp:312] Iteration 0, loss = 2.40914
[1] I0427 11:07:09.820214    97 solver.cpp:333]     Train net output #0: loss = 2.40914 (* 1 = 2.40914 loss)
[1] I0427 11:07:09.820233    97 sgd_solver.cpp:215] Iteration 0, lr = 0.01
[1] I0427 11:07:10.099892    97 solver.cpp:707] Snapshot begin
[0] I0427 11:07:10.096343   125 solver.cpp:707] Snapshot begin
[1] I0427 11:07:10.102345    97 solver.cpp:734] Snapshot end
[1] I0427 11:07:10.102360    97 solver.cpp:443] Optimization Done.
[1] I0427 11:07:10.102368    97 caffe.cpp:345] Optimization Done.
[0] I0427 11:07:10.098636   125 solver.cpp:769] Snapshotting to binary proto file examples/mnist/lenet_mlsl_iter_50.caffemodel
[0] I0427 11:07:10.102581   125 sgd_solver.cpp:754] Snapshotting solver state to binary proto file examples/mnist/lenet_mlsl_iter_50.solverstate
[0] I0427 11:07:10.105306   125 solver.cpp:734] Snapshot end
[0] I0427 11:07:10.105319   125 solver.cpp:443] Optimization Done.
[0] I0427 11:07:10.105329   125 caffe.cpp:345] Optimization Done.

real    0m0.856s
user    0m0.043s
sys     0m0.033s
Result folder: /opt/caffe/result-20180427110708

Log without setting it:

root@jfz1r04h17:/opt/caffe# ./scripts/run_intelcaffe.sh --hostfile hosts --solver examples/mnist/lenet_solver_mlsl.prototxt --network tcp --netmask enp134s0f0

CPUs with optimal settings:
    Intel Xeon E7-88/48xx, E5-46/26/16xx, E3-12xx, D15/D-15 (Broadwell)
    Intel Xeon Phi 7210/30/50/90 (Knights Landing)
    Intel Xeon Platinum 81/61/51/41/31xx (Skylake)

Settings:
    CPU: skx
    Host file: hosts
    Running mode: train
    Benchmark: none
    Debug option: off
    Engine:
    Number of MLSL servers: -1
        -1: selected automatically according to CPU model.
            BDW/SKX: 2, KNL: 4
    Solver file: examples/mnist/lenet_solver_mlsl.prototxt
    LMDB data source: examples/mnist/mnist_train_lmdb
    LMDB data source: examples/mnist/mnist_test_lmdb
    Network: tcp
    Netmask for TCP network: enp134s0f0
    NUMA configuration: Flat mode.
Create result directory: /opt/caffe/result-20180427111108
    Number of nodes: 2
MLSL_NUM_SERVERS: 2
MLSL_SERVER_AFFINITY: 6,7
Pin internal threads to: 70,71
Number of OpenMP threads: 34
Run caffe with 2 nodes...
Warning: cannot find sensors
[0] [0] MPI startup(): Intel(R) MPI Library, Version 2018 Update 1  Build 20171011 (id: 17941)
[0] [0] MPI startup(): Copyright (C) 2003-2017 Intel Corporation.  All rights reserved.
[0] [0] MPI startup(): Multi-threaded optimized library
[1] [1] ckpt_restart(): The real interface being used for tcp is enp134s0f0 and interface hostname is jfz1r04h19
[1] [1] MPI startup(): tcp data transfer mode
[0] [0] ckpt_restart(): The real interface being used for tcp is enp134s0f0 and interface hostname is jfz1r04h18
[0] [0] MPI startup(): tcp data transfer mode
[0] [0] MPI startup(): Device_reset_idx=5
[0] [0] MPI startup(): Allgather: 4: 27306-38912 & 0-2
[0] [0] MPI startup(): Allgather: 4: 78064-294912 & 0-2
[0] [0] MPI startup(): Allgather: 3: 0-27306 & 0-2
[0] [0] MPI startup(): Allgather: 3: 38912-78064 & 0-2
[0] [0] MPI startup(): Allgather: 3: 0-2147483647 & 0-2
[0] [0] MPI startup(): Allgather: 1: 0-7 & 3-4
[0] [0] MPI startup(): Allgather: 1: 9-4607 & 3-4
[0] [0] MPI startup(): Allgather: 1: 66622-461338 & 3-4
[0] [0] MPI startup(): Allgather: 3: 9081-26350 & 3-4
[0] [0] MPI startup(): Allgather: 3: 461338-2692119 & 3-4
[0] [0] MPI startup(): Allgather: 4: 7-9 & 3-4
[0] [0] MPI startup(): Allgather: 4: 4607-9081 & 3-4
[0] [0] MPI startup(): Allgather: 4: 26350-66622 & 3-4
[0] [0] MPI startup(): Allgather: 4: 0-2147483647 & 3-4
[0] [0] MPI startup(): Allgather: 2: 1-1 & 5-2147483647
[0] [0] MPI startup(): Allgather: 4: 2-3 & 5-2147483647
[0] [0] MPI startup(): Allgather: 1: 4-5 & 5-2147483647
[0] [0] MPI startup(): Allgather: 4: 6-26 & 5-2147483647
[0] [0] MPI startup(): Allgather: 1: 27-98 & 5-2147483647
[0] [0] MPI startup(): Allgather: 3: 99-1029 & 5-2147483647
[0] [0] MPI startup(): Allgather: 4: 1030-5572 & 5-2147483647
[0] [0] MPI startup(): Allgather: 1: 5573-15186 & 5-2147483647
[0] [0] MPI startup(): Allgather: 2: 15187-33976 & 5-2147483647
[0] [0] MPI startup(): Allgather: 1: 33977-74391 & 5-2147483647
[0] [0] MPI startup(): Allgather: 3: 74392-131842 & 5-2147483647
[0] [0] MPI startup(): Allgather: 4: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 3: 0-2147483647 & 0-2
[0] [0] MPI startup(): Allgatherv: 1: 0-2 & 3-4
[0] [0] MPI startup(): Allgatherv: 2: 2-7 & 3-4
[0] [0] MPI startup(): Allgatherv: 1: 7-49 & 3-4
[0] [0] MPI startup(): Allgatherv: 2: 49-113 & 3-4
[0] [0] MPI startup(): Allgatherv: 4: 113-149 & 3-4
[0] [0] MPI startup(): Allgatherv: 3: 149-915 & 3-4
[0] [0] MPI startup(): Allgatherv: 1: 915-1614 & 3-4
[0] [0] MPI startup(): Allgatherv: 4: 1614-3296 & 3-4
[0] [0] MPI startup(): Allgatherv: 2: 3296-5670 & 3-4
[0] [0] MPI startup(): Allgatherv: 1: 5670-10998 & 3-4
[0] [0] MPI startup(): Allgatherv: 4: 10998-185966 & 3-4
[0] [0] MPI startup(): Allgatherv: 3: 185966-381166 & 3-4
[0] [0] MPI startup(): Allgatherv: 4: 381166-1597083 & 3-4
[0] [0] MPI startup(): Allgatherv: 3: 1597083-2998114 & 3-4
[0] [0] MPI startup(): Allgatherv: 4: 0-2147483647 & 3-4
[0] [0] MPI startup(): Allgatherv: 2: 0-47 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 1: 47-103 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 3: 103-438 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 2: 438-757 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 4: 757-1453 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 2: 1453-3133 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 4: 3133-6762 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 2: 6762-10802 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 4: 10802-49917 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 3: 49917-309996 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 4: 309996-3739157 & 5-2147483647
[0] [0] MPI startup(): Allgatherv: 3: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Allreduce: 1: 804-1535 & 0-2
[0] [0] MPI startup(): Allreduce: 1: 2061-17116 & 0-2
[0] [0] MPI startup(): Allreduce: 2: 17116-37171 & 0-2
[0] [0] MPI startup(): Allreduce: 2: 344562-1048576 & 0-2
[0] [0] MPI startup(): Allreduce: 3: 37171-344562 & 0-2
[0] [0] MPI startup(): Allreduce: 7: 0-804 & 0-2
[0] [0] MPI startup(): Allreduce: 7: 1535-2061 & 0-2
[0] [0] MPI startup(): Allreduce: 7: 1048576-3026207 & 0-2
[0] [0] MPI startup(): Allreduce: 4: 3026207-8388608 & 0-2
[0] [0] MPI startup(): Allreduce: 7: 8388609-8635416 & 0-2
[0] [0] MPI startup(): Allreduce: 2: 0-2147483647 & 0-2
[0] [0] MPI startup(): Allreduce: 7: 0-6 & 3-4
[0] [0] MPI startup(): Allreduce: 4: 6-11 & 3-4
[0] [0] MPI startup(): Allreduce: 7: 11-49 & 3-4
[0] [0] MPI startup(): Allreduce: 6: 49-321 & 3-4
[0] [0] MPI startup(): Allreduce: 2: 321-720 & 3-4
[0] [0] MPI startup(): Allreduce: 4: 720-1375 & 3-4
[0] [0] MPI startup(): Allreduce: 1: 1375-173904 & 3-4
[0] [0] MPI startup(): Allreduce: 2: 173904-318383 & 3-4
[0] [0] MPI startup(): Allreduce: 7: 318383-1512039 & 3-4
[0] [0] MPI startup(): Allreduce: 6: 1512039-2561761 & 3-4
[0] [0] MPI startup(): Allreduce: 4: 2561762-8388608 & 3-4
[0] [0] MPI startup(): Allreduce: 7: 8388609-10618873 & 3-4
[0] [0] MPI startup(): Allreduce: 8: 0-2147483647 & 3-4
[0] [0] MPI startup(): Allreduce: 1: 0-11 & 5-8
[0] [0] MPI startup(): Allreduce: 4: 11-24 & 5-8
[0] [0] MPI startup(): Allreduce: 6: 24-42 & 5-8
[0] [0] MPI startup(): Allreduce: 1: 42-107 & 5-8
[0] [0] MPI startup(): Allreduce: 4: 107-178 & 5-8
[0] [0] MPI startup(): Allreduce: 1: 178-310 & 5-8
[0] [0] MPI startup(): Allreduce: 2: 310-594 & 5-8
[0] [0] MPI startup(): Allreduce: 5: 594-4431 & 5-8
[0] [0] MPI startup(): Allreduce: 1: 4431-54874 & 5-8
[0] [0] MPI startup(): Allreduce: 4: 54874-91696 & 5-8
[0] [0] MPI startup(): Allreduce: 6: 91696-175538 & 5-8
[0] [0] MPI startup(): Allreduce: 4: 175538-383770 & 5-8
[0] [0] MPI startup(): Allreduce: 2: 383770-684262 & 5-8
[0] [0] MPI startup(): Allreduce: 3: 0-2147483647 & 5-8
[0] [0] MPI startup(): Allreduce: 1: 0-11 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 4: 11-24 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 6: 24-42 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 1: 42-107 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 4: 107-178 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 1: 178-310 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 2: 310-594 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 5: 594-4431 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 1: 4431-54874 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 4: 54874-91696 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 6: 91696-175538 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 4: 175538-383770 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 2: 383770-32006608 & 9-2147483647
[0] [0] MPI startup(): Allreduce: 3: 0-2147483647 & 9-2147483647
[0] [0] MPI startup(): Alltoall: 3: 0-129493 & 0-2
[0] [0] MPI startup(): Alltoall: 3: 1080889-3453431 & 0-2
[0] [0] MPI startup(): Alltoall: 2: 129493-1080889 & 0-2
[0] [0] MPI startup(): Alltoall: 2: 0-2147483647 & 0-2
[0] [0] MPI startup(): Alltoall: 2: 0-2147483647 & 3-4
[0] [0] MPI startup(): Alltoall: 1: 1-64 & 5-2147483647
[0] [0] MPI startup(): Alltoall: 2: 65-572235 & 5-2147483647
[0] [0] MPI startup(): Alltoall: 4: 572236-1736997 & 5-2147483647
[0] [0] MPI startup(): Alltoall: 3: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Alltoallv: 1: 0-2147483647 & 0-2
[0] [0] MPI startup(): Alltoallv: 2: 0-2147483647 & 3-4
[0] [0] MPI startup(): Alltoallv: 2: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Alltoallw: 0: 0-2147483647 & 0-2147483647
[0] [0] MPI startup(): Barrier: 1: 0-2147483647 & 0-2
[0] [0] MPI startup(): Barrier: 6: 0-2147483647 & 3-4
[0] [0] MPI startup(): Barrier: 1: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Bcast: 7: 0-8 & 0-2
[0] [0] MPI startup(): Bcast: 7: 24-64 & 0-2
[0] [0] MPI startup(): Bcast: 7: 11264-52186 & 0-2
[0] [0] MPI startup(): Bcast: 7: 112045-131072 & 0-2
[0] [0] MPI startup(): Bcast: 7: 1048576-2097152 & 0-2
[0] [0] MPI startup(): Bcast: 1: 8-24 & 0-2
[0] [0] MPI startup(): Bcast: 1: 64-11264 & 0-2
[0] [0] MPI startup(): Bcast: 1: 52186-112045 & 0-2
[0] [0] MPI startup(): Bcast: 1: 131072-1048576 & 0-2
[0] [0] MPI startup(): Bcast: 1: 0-2147483647 & 0-2
[0] [0] MPI startup(): Bcast: 1: 1-1 & 3-4
[0] [0] MPI startup(): Bcast: 5: 2-3 & 3-4
[0] [0] MPI startup(): Bcast: 1: 4-5 & 3-4
[0] [0] MPI startup(): Bcast: 6: 6-11 & 3-4
[0] [0] MPI startup(): Bcast: 5: 12-24 & 3-4
[0] [0] MPI startup(): Bcast: 4: 25-141 & 3-4
[0] [0] MPI startup(): Bcast: 7: 142-370 & 3-4
[0] [0] MPI startup(): Bcast: 3: 371-680 & 3-4
[0] [0] MPI startup(): Bcast: 4: 681-3894 & 3-4
[0] [0] MPI startup(): Bcast: 1: 3895-4494 & 3-4
[0] [0] MPI startup(): Bcast: 7: 4495-14778 & 3-4
[0] [0] MPI startup(): Bcast: 4: 14779-18223 & 3-4
[0] [0] MPI startup(): Bcast: 7: 18224-36738 & 3-4
[0] [0] MPI startup(): Bcast: 3: 0-2147483647 & 3-4
[0] [0] MPI startup(): Bcast: 1: 0-10 & 5-2147483647
[0] [0] MPI startup(): Bcast: 1: 175-16799 & 5-2147483647
[0] [0] MPI startup(): Bcast: 6: 10-32 & 5-2147483647
[0] [0] MPI startup(): Bcast: 6: 32-175 & 5-2147483647
[0] [0] MPI startup(): Bcast: 7: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Exscan: 0: 0-2147483647 & 0-2147483647
[0] [0] MPI startup(): Gather: 2: 73643-172031 & 0-2
[0] [0] MPI startup(): Gather: 3: 0-853 & 0-2
[0] [0] MPI startup(): Gather: 3: 54613-73643 & 0-2
[0] [0] MPI startup(): Gather: 3: 262144-524288 & 0-2
[0] [0] MPI startup(): Gather: 1: 853-54613 & 0-2
[0] [0] MPI startup(): Gather: 1: 172031-262144 & 0-2
[0] [0] MPI startup(): Gather: 1: 0-2147483647 & 0-2
[0] [0] MPI startup(): Gather: 2: 34148-129691 & 3-2147483647
[0] [0] MPI startup(): Gather: 2: 503316-2506634 & 3-2147483647
[0] [0] MPI startup(): Gather: 3: 0-34148 & 3-2147483647
[0] [0] MPI startup(): Gather: 3: 129691-503316 & 3-2147483647
[0] [0] MPI startup(): Gather: 3: 0-2147483647 & 3-2147483647
[0] [0] MPI startup(): Gatherv: 1: 0-2147483647 & 0-2
[0] [0] MPI startup(): Gatherv: 1: 0-2147483647 & 3-4
[0] [0] MPI startup(): Gatherv: 1: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 4: 0-5 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 1: 5-26 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 3: 26-47 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 5: 47-98 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 3: 98-188 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 5: 188-362 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 2: 362-588 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 1: 588-1951 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 3: 1951-11702 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 1: 11702-23138 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 5: 23138-58229 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 1: 58229-191964 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 2: 191964-2656092 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 5: 0-2147483647 & 0-2
[0] [0] MPI startup(): Reduce_scatter: 4: 0-4 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 5: 4-12 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 3: 12-45 & 3-4
[1] [1] MPI startup(): Recognition=2 Platform(code=512 ippn=0 dev=4) Fabric(intra=6 inter=6 flags=0x0)
[0] [0] MPI startup(): Reduce_scatter: 1: 45-85 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 3: 85-391 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 1: 391-596 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 2: 596-1927 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 5: 1927-2286 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 3: 2286-7442 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 1: 7442-10726 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 3: 10726-45950 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 5: 45950-101084 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 1: 101084-159597 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 3: 159597-423110 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 2: 423110-578734 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 5: 578734-1329975 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 1: 1329975-4146461 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 3: 0-2147483647 & 3-4
[0] [0] MPI startup(): Reduce_scatter: 5: 0-5 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 1: 5-28 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 5: 28-50 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 3: 50-197 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 1: 197-721 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 2: 721-3207 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 1: 3207-5980 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 5: 5980-11416 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 3: 11416-104215 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 5: 104215-277330 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 3: 277330-630522 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 1: 630522-2659184 & 5-2147483647
[0] [0] MPI startup(): Reduce_scatter: 5: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Reduce: 4: 4-8 & 0-2
[0] [0] MPI startup(): Reduce: 3: 9-29 & 0-2
[0] [0] MPI startup(): Reduce: 2: 30-37 & 0-2
[0] [0] MPI startup(): Reduce: 3: 38-215 & 0-2
[0] [0] MPI startup(): Reduce: 2: 216-315 & 0-2
[0] [0] MPI startup(): Reduce: 5: 316-775 & 0-2
[0] [0] MPI startup(): Reduce: 2: 776-4045 & 0-2
[0] [0] MPI startup(): Reduce: 4: 4-6 & 3-4
[0] [0] MPI startup(): Reduce: 3: 7-11 & 3-4
[0] [0] MPI startup(): Reduce: 6: 12-16 & 3-4
[0] [0] MPI startup(): Reduce: 4: 17-34 & 3-4
[0] [0] MPI startup(): Reduce: 2: 35-99 & 3-4
[0] [0] MPI startup(): Reduce: 4: 100-230 & 3-4
[0] [0] MPI startup(): Reduce: 6: 231-275 & 3-4
[0] [0] MPI startup(): Reduce: 1: 276-1040 & 3-4
[0] [0] MPI startup(): Reduce: 3: 1041-3895 & 3-4
[0] [0] MPI startup(): Reduce: 6: 3896-4326 & 3-4
[0] [0] MPI startup(): Reduce: 3: 4327-10163 & 3-4
[0] [0] MPI startup(): Reduce: 1: 0-2147483647 & 3-4
[0] [0] MPI startup(): Reduce: 2: 4-26 & 5-2147483647
[0] [0] MPI startup(): Reduce: 4: 27-39 & 5-2147483647
[0] [0] MPI startup(): Reduce: 2: 40-230 & 5-2147483647
[0] [0] MPI startup(): Reduce: 3: 231-257 & 5-2147483647
[0] [0] MPI startup(): Reduce: 2: 258-718 & 5-2147483647
[0] [0] MPI startup(): Reduce: 3: 719-2436 & 5-2147483647
[0] [0] MPI startup(): Reduce: 4: 2437-6344 & 5-2147483647
[0] [0] MPI startup(): Reduce: 1: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Scan: 0: 0-2147483647 & 0-2147483647
[0] [0] MPI startup(): Scatter: 1: 0-1 & 0-2
[0] [0] MPI startup(): Scatter: 1: 4-12 & 0-2
[0] [0] MPI startup(): Scatter: 1: 19-2048 & 0-2
[0] [0] MPI startup(): Scatter: 3: 2048-85701 & 0-2
[0] [0] MPI startup(): Scatter: 3: 165767-466939 & 0-2
[0] [0] MPI startup(): Scatter: 3: 524288-2336552 & 0-2
[0] [0] MPI startup(): Scatter: 2: 1-4 & 0-2
[0] [0] MPI startup(): Scatter: 2: 12-19 & 0-2
[0] [0] MPI startup(): Scatter: 2: 85701-165767 & 0-2
[0] [0] MPI startup(): Scatter: 2: 466939-524288 & 0-2
[0] [0] MPI startup(): Scatter: 2: 0-2147483647 & 0-2
[0] [0] MPI startup(): Scatter: 3: 0-1909200 & 3-2147483647
[0] [0] MPI startup(): Scatter: 2: 0-2147483647 & 3-2147483647
[0] [0] MPI startup(): Scatterv: 1: 0-2147483647 & 0-2
[0] [0] MPI startup(): Scatterv: 1: 0-2147483647 & 3-4
[0] [0] MPI startup(): Scatterv: 1: 0-2147483647 & 5-2147483647
[0] [0] MPI startup(): Rank    Pid      Node name   Pin cpu
[0] [0] MPI startup(): 0       221      jfz1r04h18  {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,
[0]                                   30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56
[0]                                   ,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71}
[0] [0] MPI startup(): 1       193      jfz1r04h19  {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,
[0]                                   30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56
[0]                                   ,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71}
[0] [0] MPI startup(): Recognition=2 Platform(code=512 ippn=0 dev=4) Fabric(intra=6 inter=6 flags=0x0)
[0] [0] MPI startup(): I_MPI_COLL_INTRANODE=pt2pt
[0] [0] MPI startup(): I_MPI_DEBUG=6
[0] [0] MPI startup(): I_MPI_FABRICS=tcp
[0] [0] MPI startup(): I_MPI_FALLBACK=0
[0] [0] MPI startup(): I_MPI_INFO_NUMA_NODE_MAP=hfi1_0:0,i40iw0:0,i40iw1:0
[0] [0] MPI startup(): I_MPI_INFO_NUMA_NODE_NUM=2
[0] [0] MPI startup(): I_MPI_PIN_MAPPING=1:0 0
[0] [0] MPI startup(): I_MPI_TCP_NETMASK=enp134s0f0
[0] [0] ckpt_restart(): The real interface being used for tcp is enp134s0f0 and interface hostname is jfz1r04h18
chuanqi129 commented 6 years ago

@zhang-xin Regarding the ssh issue, did you modify the file ~/.ssh/config as follows?

Host *
    Port 10010

I mean that ~/.ssh/config should contain just those two lines and needs no further modification. If not, could you try again without commenting out the test_ssh_config function?
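
For anyone who wants to check this quickly, here is a minimal Python sketch that tests whether a config file holds only the two lines above. The helper name `is_minimal_ssh_config` is hypothetical and not part of Intel Caffe or its scripts, and the parsing is simplified: it assumes whitespace-separated `keyword value` pairs and ignores the `keyword=value` form that ssh also accepts.

```python
def is_minimal_ssh_config(text, port=10010):
    """Return True if text holds only a `Host *` entry followed by the given Port.

    Hypothetical helper for illustration; not part of Intel Caffe.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0].lower()  # ssh config keywords are case-insensitive
        value = parts[1].strip() if len(parts) > 1 else ""
        entries.append((keyword, value))
    return entries == [("host", "*"), ("port", str(port))]
```

It could be called on the contents of the file, for example `is_minimal_ssh_config(open(os.path.expanduser("~/.ssh/config")).read())` after `import os`.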

xzhangxa commented 6 years ago

@chuanqi129 thanks, it turned out to be an MLSL issue: MLSL doesn't work well with IP addresses in ~/.ssh/config, and using only hostnames works.

This looks more like an MLSL issue than an Intel Caffe one. I'll close this issue, thanks for your help!
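
In case it helps someone hitting the same hang, below is a small Python sketch for spotting IP literals in a config or host file so they can be replaced with hostnames. The function name `find_ip_literals` is made up for this example, and the file format is an assumption: whitespace-separated tokens with `#` starting a comment.

```python
import ipaddress


def find_ip_literals(text):
    """Return every whitespace-separated token in text that parses as an IP address.

    Hypothetical helper for illustration; assumes `#` starts a comment.
    """
    found = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0]  # drop trailing comments
        for token in line.split():
            try:
                ipaddress.ip_address(token)
            except ValueError:
                continue  # not an IPv4 or IPv6 literal
            found.append(token)
    return found
```

An empty result for ~/.ssh/config (and for the file passed via --hostfile, if IPs are used there too) would mean only hostnames are in use.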