VeriSilicon / TIM-VX

VeriSilicon Tensor Interface Module

Is SpatialTransformer on rv1126 supported? #131

Closed sky-fun closed 3 years ago

sky-fun commented 3 years ago

I tried to run the SpatialTransformer (mxnet) operator on rv1126 using Tengine, but the status of the "SpatialTransformer" operator in TIM-VX is "InternalOnly", with no TIM-VX API implementation. So I can't find a way to add SpatialTransformer NPU support to Tengine, and I wonder:

  1. Is the SpatialTransformer (mxnet) operator supported on the rv1126 NPU?
  2. If it is supported, how can I add it to Tengine?
thezha commented 3 years ago

SpatialTransformer is supported internally; we will add a TIM-VX API for it, and then Tengine can integrate the new API.

thezha commented 3 years ago

https://github.com/VeriSilicon/TIM-VX/pull/132

sky-fun commented 3 years ago

@thezha I added the code to Tengine and ran the example on the x86_64 simulator platform, but got errors like:

Kernel "com.vivantecorp.extension.vxcTransform_setupThres_F16toF16" does not exist
Kernel "com.vivantecorp.extension.vxcTransform_Gemm_F16toF16" does not exist
Kernel "com.vivantecorp.extension.vxcTransform_InterP_F16toF16" does not exist

Do I need to change something about the dependencies?

thezha commented 3 years ago

Did you specify the runtime compiler header file via "export VIVANTE_SDK_DIR"?

export VIVANTE_SDK_DIR=$(pwd)/prebuilt-sdk/x86_64_linux
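For reference, a minimal simulator environment setup might look like this (the `LD_LIBRARY_PATH` line is an assumption based on the default prebuilt-sdk layout, not something verified in this thread):

```shell
# Point the runtime compiler at the prebuilt x86_64 simulator SDK.
# Run this from the TIM-VX source root so $(pwd) resolves correctly.
export VIVANTE_SDK_DIR="$(pwd)/prebuilt-sdk/x86_64_linux"
# Assumption: the simulator driver libraries also need to be on the loader path.
export LD_LIBRARY_PATH="${VIVANTE_SDK_DIR}/lib:${LD_LIBRARY_PATH}"
```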

sky-fun commented 3 years ago

@thezha Yes, I did add "export VIVANTE_SDK_DIR=$(pwd)/prebuilt-sdk/x86_64_linux" before running the example.

thezha commented 3 years ago

@sky-fun These messages are not errors; they are informational. Does your case run OK? Please supply more logs if possible.

Kernel "com.vivantecorp.extension.vxcTransform_setupThres_F16toF16" does not exist
Kernel "com.vivantecorp.extension.vxcTransform_Gemm_F16toF16" does not exist
Kernel "com.vivantecorp.extension.vxcTransform_InterP_F16toF16" does not exist

sky-fun commented 3 years ago

@thezha I can run my demo successfully, but with an unexpected result, so I used "export VIV_VX_DEBUG_LEVEL=4" to see those messages. I thought they were errors because of the phrase "does not exist", and that SpatialTransformer may not be running successfully on TIM-VX. Below are all my logs:


#productname=VSI SIMULATOR, pid=0xc5
prev_ptrs = 0x55b69df8c080
prev_ptrs = 0x55b69df8cac0
Kernel "com.vivantecorp.extension.vxcTransform_setupThres_F16toF16" does not exist
Kernel "com.vivantecorp.extension.vxcTransform_Gemm_F16toF16" does not exist
Kernel "com.vivantecorp.extension.vxcTransform_InterP_F16toF16" does not exist
---------------------------Begin VerifyTiling -------------------------
AXI-SRAM = 0 Bytes VIP-SRAM = 260096 Bytes SWTILING_PHASE_FEATURES[1, 1, 1]
  0 TP [( 112  128    3 1,    43008, 0x0x55b69ded6710(0x0x55b69ded6710, 0x(nil)) ->  112  128    3 1,    86016, 0x0x55b69dfe5d10(0x0x55b69dfe5d10, 0x(nil))) k(0 0    0,        0) pad(0 0) pool(0 0, 1 1)] C[ 11]
  1 NN [( 112  128    3 1,    43008, 0x0x55b69ded6710(0x0x55b69ded6710, 0x(nil)) ->   56   64    3 1,    10752, 0x0x55b69dfce460(0x0x55b69dfce460, 0x(nil))) k(2 2    1,      256) pad(0 0) pool(2 2, 2 2)] C[  2]
  2 NN [(  56   64    3 1,    10752, 0x0x55b69dfce460(0x0x55b69dfce460, 0x(nil)) ->   27   31   24 1,    20088, 0x0x55b69dfd3de0(0x0x55b69dfd3de0, 0x(nil))) k(3 3    3,     1024) pad(0 0) pool(2 2, 2 2)] P[  1] C[  3]
  3 NN [(  27   31   24 1,    20088, 0x0x55b69dfd3de0(0x0x55b69dfd3de0, 0x(nil)) ->   25   29   24 1,    17400, 0x0x55b69dfd8ae0(0x0x55b69dfd8ae0, 0x(nil))) k(3 3   24,     5888) pad(0 0) pool(0 0, 1 1)] P[  2] C[  4]
  4 TP [(  25   29   24 1,    17400, 0x0x55b69dfd8ae0(0x0x55b69dfd8ae0, 0x(nil)) ->   12   14   24 1,     4032, 0x0x55b69dfd9c50(0x0x55b69dfd9c50, 0x(nil))) k(0 0    0,        0) pad(0 0) pool(2 2, 2 2)] P[  3] C[  5]
  5 NN [(  12   14   24 1,     4032, 0x0x55b69dfd9c50(0x0x55b69dfd9c50, 0x(nil)) ->    5    6   48 1,     1440, 0x0x55b69dfcd6e0(0x0x55b69dfcd6e0, 0x(nil))) k(3 3   24,    11520) pad(0 0) pool(2 2, 2 2)] P[  4] C[  6]
  6 TP [(1440    1    1 1,     1440, 0x0x55b69dfcd6e0(0x0x55b69dfcd6e0, 0x(nil)) ->   32    1    1 1,       32, 0x0x55b69dfe2ed0(0x0x55b69dfe2ed0, 0x(nil))) k(0 0    0,        0) pad(0 0) pool(0 0, 1 1)] P[  5] C[  7]
  7 TP [(  32    1    1 1,       32, 0x0x55b69dfe2ed0(0x0x55b69dfe2ed0, 0x(nil)) ->    6    1    1 1,        6, 0x0x55b69dfe4040(0x0x55b69dfe4040, 0x(nil))) k(0 0    0,        0) pad(0 0) pool(0 0, 1 1)] P[  6] C[  8]
  8 TP [(   6    1    1 1,        6, 0x0x55b69dfe4040(0x0x55b69dfe4040, 0x(nil)) ->    6    1    1 1,       12, 0x0x55b69dfe6240(0x0x55b69dfe6240, 0x(nil))) k(0 0    0,        0) pad(0 0) pool(0 0, 1 1)] P[  7] C[  9]
  9 SH [(   6    1    1 1,       12, 0x0x55b69dff0750(0x0x55b69dff0750, 0x(nil)) ->    6    1    1 1,       12, 0x0x55b69dfe7f90(0x0x55b69dfe7f90, 0x(nil))) k(0 0    0,        0) pad(0 0) pool(0 0, 1 1)] P[  8] C[ 10]
 10 SH [(   6    1    1 1,       12, 0x0x55b69dfe7f90(0x0x55b69dfe7f90, 0x(nil)) ->  224  128    1 1,    57344, 0x0x55b69dfe8da0(0x0x55b69dfe8da0, 0x(nil))) k(0 0    0,        0) pad(0 0) pool(0 0, 1 1)] P[  9] C[ 11]
 11 SH [( 112  128    3 1,    86016, 0x0x55b69dfe5d10(0x0x55b69dfe5d10, 0x(nil)) ->  112  128    3 1,    86016, 0x0x55b69dfe6900(0x0x55b69dfe6900, 0x(nil))) k(0 0    0,        0) pad(0 0) pool(0 0, 1 1)] P[  0, 10] C[ 12]
 12 TP [( 112  128    3 1,    86016, 0x0x55b69dfe6900(0x0x55b69dfe6900, 0x(nil)) ->  112  128    3 1,    43008, 0x0x55b69dfe5790(0x0x55b69dfe5790, 0x(nil))) k(0 0    0,        0) pad(0 0) pool(0 0, 1 1)] P[ 11] C[ 13]
 13 NN [( 112  128    3 1,    43008, 0x0x55b69dfe5790(0x0x55b69dfe5790, 0x(nil)) ->  112  128   24 1,   344064, 0x0x55b69e06fe60(0x0x55b69e06fe60, 0x(nil))) k(3 3    3,     1024) pad(1 1) pool(0 0, 1 1)] P[ 12] C[ 14, 17]
 14 TP [( 112  128   24 1,   344064, 0x0x55b69e06fe60(0x0x55b69e06fe60, 0x(nil)) ->   57   65   96 1,   355680, 0x0x55b6a0375d70(0x0x55b6a0375d70, 0x(nil))) k(0 0    0,        0) pad(1 1) pool(0 0, 1 1)] P[ 13] C[ 15]
 15 NN [(  57   65   96 1,   355680, 0x0x55b6a0375d70(0x0x55b6a0375d70, 0x(nil)) ->   56   64   24 1,    86016, 0x0x55b69e02bc30(0x0x55b69e02bc30, 0x(nil))) k(2 2   96,    10496) pad(0 0) pool(0 0, 1 1)] P[ 14] C[ 16]
 16 NN [(  56   64   24 1,    86016, 0x0x55b69e02bc30(0x0x55b69e02bc30, 0x(nil)) ->   56   64   24 1,   172032, 0x0x55b69e08efc0(0x0x55b69e08efc0, 0x(nil))) k(3 3   24,     5888) pad(1 1) pool(0 0, 1 1)] P[ 15] C[ 18]
 17 NN [( 112  128   24 1,   344064, 0x0x55b69e06fe60(0x0x55b69e06fe60, 0x(nil)) ->   56   64   24 1,   172032, 0x0x55b6a1ee6fc0(0x0x55b69e08efc0, 0x0x15000)) k(1 1   24,      768) pad(0 0) pool(2 2, 2 2)] P[ 13] C[ 18]
 18 NN [(  56   64   48 1,   172032, 0x0x55b69e08efc0(0x0x55b69e08efc0, 0x(nil)) ->   56   64   24 1,   172032, 0x0x55b6a1eec5a0(0x0x55b69e0945a0, 0x0x15000)) k(1 1   48,     1536) pad(0 0) pool(0 0, 1 1)] P[ 16, 17] C[ 19, 22]
 19 TP [(  56   64   24 1,   172032, 0x0x55b6a1eec5a0(0x0x55b69e0945a0, 0x0x15000) ->   56   64   24 1,    86016, 0x0x55b69e058fe0(0x0x55b69e058fe0, 0x(nil))) k(0 0    0,        0) pad(0 0) pool(0 0, 1 1)] P[ 18] C[ 20]
 20 NN [(  56   64   24 1,    86016, 0x0x55b69e058fe0(0x0x55b69e058fe0, 0x(nil)) ->   56   64   24 1,    86016, 0x0x55b69e05c740(0x0x55b69e05c740, 0x(nil))) k(3 3   24,     5888) pad(1 1) pool(0 0, 1 1)] P[ 19] C[ 21]
 21 NN [(  56   64   24 1,    86016, 0x0x55b69e05c740(0x0x55b69e05c740, 0x(nil)) ->   56   64   24 1,   172032, 0x0x55b69e0945a0(0x0x55b69e0945a0, 0x(nil))) k(3 3   24,     5888) pad(1 1) pool(0 0, 1 1)] P[ 20] C[ 22]
 22 NN [(  56   64   48 1,   172032, 0x0x55b69e0945a0(0x0x55b69e0945a0, 0x(nil)) ->   56   64   24 1,    86016, 0x0x55b69dffdde0(0x0x55b69dffdde0, 0x(nil))) k(1 1   48,     1536) pad(0 0) pool(0 0, 1 1)] P[ 18, 21] C[ 23, 26]
 23 TP [(  56   64   24 1,    86016, 0x0x55b69dffdde0(0x0x55b69dffdde0, 0x(nil)) ->   29   33   96 1,    91872, 0x0x55b6a037aa70(0x0x55b6a037aa70, 0x(nil))) k(0 0    0,        0) pad(1 1) pool(0 0, 1 1)] P[ 22] C[ 24]
 24 NN [(  29   33   96 1,    91872, 0x0x55b6a037aa70(0x0x55b6a037aa70, 0x(nil)) ->   28   32   48 1,    43008, 0x0x55b69e0018f0(0x0x55b69e0018f0, 0x(nil))) k(2 2   96,    20992) pad(0 0) pool(0 0, 1 1)] P[ 23] C[ 25]
 25 NN [(  28   32   48 1,    43008, 0x0x55b69e0018f0(0x0x55b69e0018f0, 0x(nil)) ->   28   32   48 1,    86016, 0x0x55b69e0980f0(0x0x55b69e0980f0, 0x(nil))) k(3 3   48,    22784) pad(1 1) pool(0 0, 1 1)] P[ 24] C[ 27]
 26 NN [(  56   64   24 1,    86016, 0x0x55b69dffdde0(0x0x55b69dffdde0, 0x(nil)) ->   28   32   48 1,    86016, 0x0x55b69ffc40f0(0x0x55b69e0980f0, 0x0xa800)) k(1 1   24,     1536) pad(0 0) pool(2 2, 2 2)] P[ 22] C[ 27]
 27 NN [(  28   32   96 1,    86016, 0x0x55b69e0980f0(0x0x55b69e0980f0, 0x(nil)) ->   28   32   48 1,    86016, 0x0x55b69ffc7c40(0x0x55b69e09bc40, 0x0xa800)) k(1 1   96,     6144) pad(0 0) pool(0 0, 1 1)] P[ 25, 26] C[ 28, 31]
 28 TP [(  28   32   48 1,    86016, 0x0x55b69ffc7c40(0x0x55b69e09bc40, 0x0xa800) ->   28   32   48 1,    43008, 0x0x55b69e009150(0x0x55b69e009150, 0x(nil))) k(0 0    0,        0) pad(0 0) pool(0 0, 1 1)] P[ 27] C[ 29]
 29 NN [(  28   32   48 1,    43008, 0x0x55b69e009150(0x0x55b69e009150, 0x(nil)) ->   28   32   48 1,    43008, 0x0x55b69e00cc60(0x0x55b69e00cc60, 0x(nil))) k(3 3   48,    23040) pad(1 1) pool(0 0, 1 1)] P[ 28] C[ 30]
 30 NN [(  28   32   48 1,    43008, 0x0x55b69e00cc60(0x0x55b69e00cc60, 0x(nil)) ->   28   32   48 1,    86016, 0x0x55b69e09bc40(0x0x55b69e09bc40, 0x(nil))) k(3 3   48,    23040) pad(1 1) pool(0 0, 1 1)] P[ 29] C[ 31]
 31 NN [(  28   32   96 1,    86016, 0x0x55b69e09bc40(0x0x55b69e09bc40, 0x(nil)) ->   28   32   48 1,    43008, 0x0x55b69e011970(0x0x55b69e011970, 0x(nil))) k(1 1   96,     6144) pad(0 0) pool(0 0, 1 1)] P[ 27, 30] C[ 32, 33]
 32 NN [(  28   32   48 1,    43008, 0x0x55b69e011970(0x0x55b69e011970, 0x(nil)) ->   14   16   64 1,    28672, 0x0x55b69eb03790(0x0x55b69e09f790, 0x0x3800)) k(1 1   48,     3584) pad(0 0) pool(2 2, 2 2)] P[ 31] C[ 36]
 33 TP [(  28   32   48 1,    43008, 0x0x55b69e011970(0x0x55b69e011970, 0x(nil)) ->   15   17  192 1,    48960, 0x0x55b6a037e410(0x0x55b6a037e410, 0x(nil))) k(0 0    0,        0) pad(1 1) pool(0 0, 1 1)] P[ 31] C[ 34]
 34 NN [(  15   17  192 1,    48960, 0x0x55b6a037e410(0x0x55b6a037e410, 0x(nil)) ->   14   16   64 1,    14336, 0x0x55b69e017fd0(0x0x55b69e017fd0, 0x(nil))) k(2 2  192,    56320) pad(0 0) pool(0 0, 1 1)] P[ 33] C[ 35]
 35 NN [(  14   16   64 1,    14336, 0x0x55b69e017fd0(0x0x55b69e017fd0, 0x(nil)) ->   14   16   64 1,    28672, 0x0x55b69e09f790(0x0x55b69e09f790, 0x(nil))) k(3 3   64,    40448) pad(1 1) pool(0 0, 1 1)] P[ 34] C[ 36]
 36 NN [(  14   16  128 1,    28672, 0x0x55b69e09f790(0x0x55b69e09f790, 0x(nil)) ->   14   16   64 1,    14336, 0x0x55b69e01bc90(0x0x55b69e01bc90, 0x(nil))) k(1 1  128,    10496) pad(0 0) pool(0 0, 1 1)] P[ 32, 35] C[ 37, 40]
 37 TP [(  14   16   64 1,    14336, 0x0x55b69e01bc90(0x0x55b69e01bc90, 0x(nil)) ->   14   16   64 1,    14336, 0x0x55b69e01cce0(0x0x55b69e01cce0, 0x(nil))) k(0 0    0,        0) pad(0 0) pool(0 0, 1 1)] P[ 36] C[ 38]
 38 NN [(  14   16   64 1,    14336, 0x0x55b69e01cce0(0x0x55b69e01cce0, 0x(nil)) ->   14   16   64 1,    14336, 0x0x55b69e0723c0(0x0x55b69e0723c0, 0x(nil))) k(3 3   64,    40704) pad(1 1) pool(0 0, 1 1)] P[ 37] C[ 39]
 39 NN [(  14   16   64 1,    14336, 0x0x55b69e0723c0(0x0x55b69e0723c0, 0x(nil)) ->   14   16   64 1,    14336, 0x0x55b69e073530(0x0x55b69e073530, 0x(nil))) k(3 3   64,    40448) pad(1 1) pool(0 0, 1 1)] P[ 38] C[ 40]
 40 SH [(14336    1    1 1,    14336, 0x0x55b69e073530(0x0x55b69e073530, 0x(nil)) -> 14336    1    1 1,    14336, 0x0x55b69e0770d0(0x0x55b69e0770d0, 0x(nil))) k(0 0    0,        0) pad(0 0) pool(0 0, 1 1)] P[ 39, 36] C[ 41, 42]
 41 NN [(  14   16   64 1,    14336, 0x0x55b69e0770d0(0x0x55b69e0770d0, 0x(nil)) ->    7    8  128 1,     7168, 0x0x55b69e078090(0x0x55b69e078090, 0x(nil))) k(1 1   64,     9472) pad(0 0) pool(2 2, 2 2)] P[ 40] C[ 45]
 42 TP [(  14   16   64 1,    14336, 0x0x55b69e0770d0(0x0x55b69e0770d0, 0x(nil)) ->    8    9  256 1,    18432, 0x0x55b6a03831c0(0x0x55b6a03831c0, 0x(nil))) k(0 0    0,        0) pad(1 1) pool(0 0, 1 1)] P[ 40] C[ 43]
 43 NN [(   8    9  256 1,    18432, 0x0x55b6a03831c0(0x0x55b6a03831c0, 0x(nil)) ->    7    8  128 1,     7168, 0x0x55b69e07d730(0x0x55b69e07d730, 0x(nil))) k(2 2  256,   149760) pad(0 0) pool(0 0, 1 1)] P[ 42] C[ 44]
 44 NN [(   7    8  128 1,     7168, 0x0x55b69e07d730(0x0x55b69e07d730, 0x(nil)) ->    7    8  128 1,     7168, 0x0x55b69e07e8a0(0x0x55b69e07e8a0, 0x(nil))) k(3 3  128,   164864) pad(1 1) pool(0 0, 1 1)] P[ 43] C[ 45]
 45 SH [(7168    1    1 1,     7168, 0x0x55b69e07e8a0(0x0x55b69e07e8a0, 0x(nil)) -> 7168    1    1 1,     7168, 0x0x55b69e0813f0(0x0x55b69e0813f0, 0x(nil))) k(0 0    0,        0) pad(0 0) pool(0 0, 1 1)] P[ 44, 41] C[ 46, 49]
 46 TP [(   7    8  128 1,     7168, 0x0x55b69e0813f0(0x0x55b69e0813f0, 0x(nil)) ->    7    8  128 1,     7168, 0x0x55b69e082440(0x0x55b69e082440, 0x(nil))) k(0 0    0,        0) pad(0 0) pool(0 0, 1 1)] P[ 45] C[ 47]
 47 NN [(   7    8  128 1,     7168, 0x0x55b69e082440(0x0x55b69e082440, 0x(nil)) ->    7    8  128 1,     7168, 0x0x55b69e085dc0(0x0x55b69e085dc0, 0x(nil))) k(3 3  128,   166656) pad(1 1) pool(0 0, 1 1)] P[ 46] C[ 48]
 48 NN [(   7    8  128 1,     7168, 0x0x55b69e085dc0(0x0x55b69e085dc0, 0x(nil)) ->    7    8  128 1,     7168, 0x0x55b69e086f30(0x0x55b69e086f30, 0x(nil))) k(3 3  128,   165376) pad(1 1) pool(0 0, 1 1)] P[ 47] C[ 49]
 49 SH [(7168    1    1 1,     7168, 0x0x55b69e086f30(0x0x55b69e086f30, 0x(nil)) -> 7168    1    1 1,     7168, 0x0x55b69e08aad0(0x0x55b69e08aad0, 0x(nil))) k(0 0    0,        0) pad(0 0) pool(0 0, 1 1)] P[ 48, 45] C[ 50]
 50 TP [(   7    8  128 1,     7168, 0x0x55b69e08aad0(0x0x55b69e08aad0, 0x(nil)) ->    3    4  128 1,     1536, 0x0x55b69dfcdd80(0x0x55b69dfcdd80, 0x(nil))) k(0 0    0,        0) pad(0 0) pool(2 2, 2 2)] P[ 49] C[ 51]
 51 SH [(   1 1536    1 1,     1536, 0x0x55b69dfcdd80(0x0x55b69dfcdd80, 0x(nil)) ->    1 1536    1 1,     1536, 0x0x55b69dee0ca0(0x0x55b69dee0ca0, 0x(nil))) k(0 0    0,        0) pad(0 0) pool(0 0, 1 1)] P[ 50]

Detected Segments
AB_VS (0 - 8)
AB_VS (12 - 13)
TL_VS (14 - 15)
AB_VS (15 - 18)
AB_VS (19 - 36)
AB_VS (37 - 39)
AB_VS (42 - 44)
AB_VS (46 - 48)
======================== Block [0 - 8] ==============================
  0 TP DD -> DD [(   43008,    86016), IC(       0), KC(       0), NNT(       0,        0)]
  1 NN DD -> VS [(   43008,    10752), IC(    1792), KC(     256), NNT(       0,        0)]
  2 NN VS -> VS [(   10752,    20224), IC(       0), KC(    1024), NNT(       0,        0)]
  3 NN VS -> VS [(   20224,    17408), IC(       0), KC(    5888), NNT(       0,        0)]
  4 TP VS -> VS [(   17408,     4096), IC(       0), KC(       0), NNT(       0,        0)]
  5 NN VS -> VS [(    4096,     1536), IC(       0), KC(   10752), NNT(       0,        0)]
  6 TP VS -> VS [(    1536,      256), IC(       0), KC(       0), NNT(       0,        0)]
  7 TP VS -> VS [(     256,      256), IC(       0), KC(       0), NNT(       0,        0)]
  8 TP VS -> DD [(     256,       12), IC(       0), KC(       0), NNT(       0,        0)]
------------------------------------------------------------------
Segment AB (0 - 8)

AXISRAM: Peak used 0  0.000000% VIPSRAM: Peak used 43520  16.732283%
======================== Block [0 - 8] SUCCEED =========================
======================== Block [12 - 13] ==============================
 12 TP DD -> VS [(   86016,    43008), IC(       0), KC(       0), NNT(       0,        0)]
 13 NN VS -> DD [(   43008,        0), IC(       0), KC(    1024), NNT(       0,        0)]
------------------------------------------------------------------
Segment AB (12 - 13)

AXISRAM: Peak used 0  0.000000% VIPSRAM: Peak used 44032  16.929132%
======================== Block [12 - 13] SUCCEED =========================
======================== Block [14 - 18] ==============================
 14 TP DD -> VS [(       0,   125952), IC(       0), KC(       0), NNT(       0,        0)]
 15 NN VS -> VS [(  125952,    86016), IC(       0), KC(    7424), NNT(       0,        0)]
 16 NN VS -> VS [(   86016,   172032), IC(       0), KC(    5120), NNT(       0,        0)]
 17 NN DD -> VS [(       0,   172032), IC(    3072), KC(    1024), NNT(       0,        0)]
 18 NN VS -> DD [(  172032,        0), IC(       0), KC(     768), NNT(       0,        0)]
------------------------------------------------------------------
Segment Tiling (14 - 15)
[DD   1(   0,   1)(   0) ->VS   1(   0,   1)(  23) P( 1) F(1)]    [VS  23(   0,  23)(  23) ->VS  22(   0,  22)(  64) P( 0) F(0)]    
[DD  44(   1,  45)(   0) ->VS  22(   1,  23)(  23) P( 0) F(1)]    [VS  23(  22,  45)(  23) ->VS  22(  22,  44)(  64) P( 0) F(0)]    
[DD  44(  45,  89)(   0) ->VS  22(  23,  45)(  23) P( 0) F(1)]    [VS  21(  44,  65)(  23) ->VS  20(  44,  64)(  64) P( 0) F(1)]    
[DD  39(  89, 128)(   0) ->VS  20(  45,  65)(  23) P( 0) F(1)]                                                                      

   0 [(  14,    0)  F(1)]
   1 [(  14,    1)  F(1)]
   2 [(  15,    0)  F(0)]   3 [(  14,    2)  F(1)]
   4 [(  15,    1)  F(0)]   5 [(  14,    3)  F(1)]
   6 [(  15,    2)  F(1)]

AXISRAM: Estimate used 0  0.000000%  VIPSRAM: Estimate used 136352  52.423721% M = 22
------------------------------------------------------------------
Segment AB (16 - 18)

AXISRAM: Peak used 0  0.000000% VIPSRAM: Peak used 260096  100.000000%
======================== Block [14 - 18] SUCCEED =========================
======================== Block [19 - 36] ==============================
 19 TP DD -> VS [(       0,    86016), IC(       0), KC(       0), NNT(       0,        0)]
 20 NN VS -> VS [(   86016,    86016), IC(       0), KC(    5888), NNT(       0,        0)]
 21 NN VS -> DD [(   86016,        0), IC(       0), KC(    5888), NNT(       0,        0)]
 22 NN DD -> VS [(       0,    86016), IC(    8192), KC(     768), NNT(       0,        0)]
 23 TP VS -> VS [(   86016,    91904), IC(       0), KC(       0), NNT(       0,        0)]
 24 NN VS -> VS [(   91904,    43008), IC(       0), KC(   14848), NNT(       0,        0)]
 25 NN VS -> VS [(   43008,    86016), IC(       0), KC(   21248), NNT(       0,        0)]
 26 NN VS -> VS [(   86016,    86016), IC(       0), KC(    1280), NNT(       0,        0)]
 27 NN VS -> VS [(   86016,    86016), IC(       0), KC(    1024), NNT(       0,        0)]
 28 TP VS -> VS [(   86016,    43008), IC(       0), KC(       0), NNT(       0,        0)]
 29 NN VS -> VS [(   43008,    43008), IC(       0), KC(   20992), NNT(       0,        0)]
 30 NN VS -> VS [(   43008,    86016), IC(       0), KC(   20224), NNT(       0,        0)]
 31 NN VS -> VS [(   86016,    43008), IC(       0), KC(    1024), NNT(       0,        0)]
 32 NN VS -> VS [(   43008,    28672), IC(       0), KC(    3584), NNT(       0,        0)]
 33 TP VS -> VS [(   43008,    49152), IC(       0), KC(       0), NNT(       0,        0)]
 34 NN VS -> VS [(   49152,    14336), IC(       0), KC(   35072), NNT(       0,        0)]
 35 NN VS -> VS [(   14336,    28672), IC(       0), KC(   38912), NNT(       0,        0)]
 36 NN VS -> DD [(   28672,        0), IC(       0), KC(    1280), NNT(       0,        0)]
------------------------------------------------------------------
Segment AB (19 - 36)

AXISRAM: Peak used 0  0.000000% VIPSRAM: Peak used 236288  90.846458%
======================== Block [19 - 36] SUCCEED =========================
======================== Block [37 - 39] ==============================
 37 TP DD -> VS [(       0,    14336), IC(       0), KC(       0), NNT(       0,        0)]
 38 NN VS -> VS [(   14336,    14336), IC(       0), KC(   36352), NNT(       0,        0)]
 39 NN VS -> DD [(   14336,        0), IC(       0), KC(   38144), NNT(       0,        0)]
------------------------------------------------------------------
Segment AB (37 - 39)

AXISRAM: Peak used 0  0.000000% VIPSRAM: Peak used 65024  25.000000%
======================== Block [37 - 39] SUCCEED =========================
======================== Block [42 - 44] ==============================
 42 TP DD -> VS [(       0,    18432), IC(       0), KC(       0), NNT(       0,        0)]
 43 NN VS -> VS [(   18432,     7168), IC(       0), KC(   94720), NNT(       0,        0)]
 44 NN VS -> DD [(    7168,        0), IC(       0), KC(  127488), NNT(       0,        0)]
------------------------------------------------------------------
Segment AB (42 - 44)

AXISRAM: Peak used 0  0.000000% VIPSRAM: Peak used 134656  51.771652%
======================== Block [42 - 44] SUCCEED =========================
======================== Block [46 - 48] ==============================
 46 TP DD -> VS [(       0,     7168), IC(       0), KC(       0), NNT(       0,        0)]
 47 NN VS -> VS [(    7168,     7168), IC(       0), KC(  116480), NNT(       0,        0)]
 48 NN VS -> DD [(    7168,        0), IC(       0), KC(  123904), NNT(       0,        0)]
------------------------------------------------------------------
Segment AB (46 - 48)

AXISRAM: Peak used 0  0.000000% VIPSRAM: Peak used 131072  50.393700%
======================== Block [46 - 48] SUCCEED =========================

 id IN [ x  y  w   h ]   OUT  [ x  y  w  h ] (tx, ty, kpc) (ic, kc, kc/ks, ks/eks, kernel_type)
   0 TP DD 0x(nil) [   0    0      112      128] -> DD 0x(nil) [   0    0      112      128] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
   1 NN DD 0x(nil) [   0    0      112      128] -> VS 0x0xff000800 [   0    0       56       64] ( 64,   8,   1) (    1792,      256, 100.000000%, 100.000000%, DD)
   2 NN VS 0x0xff000800 [   0    0       56       64] -> VS 0x0xff006300 [   0    0       27       31] ( 28,  16,   3) (       0,     1024, 100.000000%, 100.000000%, DD)
   3 NN VS 0x0xff006300 [   0    0       27       31] -> VS 0x0xff000800 [   0    0       25       29] ( 25,  12,   6) (       0,     5888, 100.000000%, 100.000000%, DD)
   4 TP VS 0x0xff000800 [   0    0       25       29] -> VS 0x0xff00a200 [   0    0       12       14] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
   5 NN VS 0x0xff00a200 [   0    0       12       14] -> VS 0x0xff000800 [   0    0        5        6] ( 10,  12,  12) (       0,    10752, 100.000000%, 93.333334%, DD)
   6 TP VS 0x0xff000800 [   0    0     1440        1] -> VS 0x0xff00b100 [   0    0       32        1] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
   7 TP VS 0x0xff00b100 [   0    0       32        1] -> VS 0x0xff000800 [   0    0        6        1] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
   8 TP VS 0x0xff000800 [   0    0        6        1] -> DD 0x(nil) [   0    0        6        1] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
   9 SH DD 0x(nil) [   0    0        0        0] -> DD 0x(nil) [   0    0        0        0] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  10 SH DD 0x(nil) [   0    0        0        0] -> DD 0x(nil) [   0    0        0        0] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  11 SH DD 0x(nil) [   0    0        0        0] -> DD 0x(nil) [   0    0        0        0] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  12 TP DD 0x(nil) [   0    0      112      128] -> VS 0x0xff000800 [   0    0      112      128] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  13 NN VS 0x0xff000800 [   0    0      112      128] -> DD 0x(nil) [   0    0      112      128] ( 64,   4,   6) (       0,     1024, 100.000000%, 100.000000%, DD)
  14 TP DD 0x(nil) [   0    0      112        1] -> VS 0x0xff000800 [   0    0       57        1] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  14 TP DD 0x0x70 [   0    1      112       44] -> VS 0x0xff000839 [   0    1       57       22] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  15 NN VS 0x0xff000800 [   0    0       57       23] -> VS 0x0xff02b000 [   0    0       56       22] ( 56,   8,   3) (       0,     7424, 100.000000%, 70.731705%, DD)
  14 TP DD 0x0x13b0 [   0   45      112       44] -> VS 0x0xff000d1f [   0   23       57       22] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  15 NN VS 0x0xff000ce6 [   0   22       57       23] -> VS 0x0xff02b4d0 [   0   22       56       22] ( 56,   8,   3) (       0,     7424, 100.000000%, 70.731705%, DD)
  14 TP DD 0x0x26f0 [   0   89      112       39] -> VS 0x0xff001205 [   0   45       57       20] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  15 NN VS 0x0xff0011cc [   0   44       57       21] -> VS 0x0xff02b9a0 [   0   44       56       20] ( 56,   8,   3) (       0,     7424, 100.000000%, 70.731705%, DD)
  16 NN VS 0x0xff02b000 [   0    0       56       64] -> VS 0x0xff000800 [   0    0       56       64] ( 28,   9,   6) (       0,     2048, 40.000001%, 86.956519%, DD)
  17 NN DD 0x(nil) [   0    0      112      128] -> VS 0x0xff015800 [   0    0       56       64] ( 64,   2,   6) (    3072,     1024, 100.000000%, 133.333337%, DD)
  18 NN VS 0x0xff000800 [   0    0       56       64] -> DD 0x(nil) [   0    0       56       64] ( 56,   8,   3) (       0,      768, 100.000000%, 50.000000%, DD)
  19 TP DD 0x(nil) [   0    0       56       64] -> VS 0x0xff000800 [   0    0       56       64] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  20 NN VS 0x0xff000800 [   0    0       56       64] -> VS 0x0xff025300 [   0    0       56       64] ( 28,   9,   6) (       0,     5888, 100.000000%, 100.000000%, DD)
  21 NN VS 0x0xff025300 [   0    0       56       64] -> DD 0x(nil) [   0    0       56       64] ( 56,   6,   3) (       0,     5888, 100.000000%, 100.000000%, DD)
  22 NN DD 0x(nil) [   0    0       56       64] -> VS 0x0xff000800 [   0    0       56       64] ( 56,   3,   6) (    8192,      768, 100.000000%, 50.000000%, DD)
  23 TP VS 0x0xff000800 [   0    0       56       64] -> VS 0x0xff023c00 [   0    0       29       33] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  24 NN VS 0x0xff023c00 [   0    0       29       33] -> VS 0x0xff015800 [   0    0       28       32] ( 28,  16,   4) (       0,    14848, 100.000000%, 70.731705%, DD)
  25 NN VS 0x0xff015800 [   0    0       28       32] -> VS 0x0xff025300 [   0    0       28       32] ( 28,  16,   4) (       0,    21248, 100.000000%, 93.258429%, DD)
  26 NN VS 0x0xff000800 [   0    0       56       64] -> VS 0x0xff02fb00 [   0    0       28       32] ( 56,   2,  12) (       0,     1280, 100.000000%, 83.333331%, DD)
  27 NN VS 0x0xff025300 [   0    0       28       32] -> VS 0x0xff00b000 [   0    0       28       32] ( 28,   4,  12) (       0,     1024, 100.000000%, 16.666667%, DD)
  28 TP VS 0x0xff00b000 [   0    0       28       32] -> VS 0x0xff02fb00 [   0    0       28       32] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  29 NN VS 0x0xff02fb00 [   0    0       28       32] -> VS 0x0xff015800 [   0    0       28       32] ( 28,  16,   4) (       0,    20992, 100.000000%, 91.111112%, DD)
  30 NN VS 0x0xff015800 [   0    0       28       32] -> VS 0x0xff000800 [   0    0       28       32] ( 28,  16,   4) (       0,    20224, 100.000000%, 87.777776%, DD)
  31 NN VS 0x0xff000800 [   0    0       28       32] -> VS 0x0xff02fb00 [   0    0       28       32] ( 28,   4,  12) (       0,     1024, 100.000000%, 16.666667%, DD)
  32 NN VS 0x0xff02fb00 [   0    0       28       32] -> VS 0x0xff004000 [   0    0       14       16] ( 28,   4,  16) (       0,     3584, 100.000000%, 100.000000%, DD)
  33 TP VS 0x0xff02fb00 [   0    0       28       32] -> VS 0x0xff007800 [   0    0       15       17] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  34 NN VS 0x0xff007800 [   0    0       15       17] -> VS 0x0xff036b00 [   0    0       14       16] ( 14,  13,   8) (       0,    35072, 100.000000%, 62.272727%, DD)
  35 NN VS 0x0xff036b00 [   0    0       14       16] -> VS 0x0xff000800 [   0    0       14       16] ( 14,  12,   8) (       0,    38912, 100.000000%, 96.202534%, DD)
  36 NN VS 0x0xff000800 [   0    0       14       16] -> DD 0x(nil) [   0    0       14       16] ( 14,  16,   8) (       0,     1280, 100.000000%, 12.195122%, DD)
  37 TP DD 0x(nil) [   0    0       14       16] -> VS 0x0xff000800 [   0    0       14       16] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  38 NN VS 0x0xff000800 [   0    0       14       16] -> VS 0x0xff00ce00 [   0    0       14       16] ( 14,  12,   8) (       0,    36352, 100.000000%, 89.308178%, DD)
  39 NN VS 0x0xff00ce00 [   0    0       14       16] -> DD 0x(nil) [   0    0       14       16] ( 14,  16,   8) (       0,    38144, 100.000000%, 94.303799%, DD)
  40 SH DD 0x(nil) [   0    0        0        0] -> DD 0x(nil) [   0    0        0        0] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  41 NN DD 0x(nil) [   0    0       14       16] -> DD 0x(nil) [   0    0        7        8] ( 14,  16,   8) (   14848,     9728, 100.000000%, 102.702701%, DD)
  42 TP DD 0x(nil) [   0    0       14       16] -> VS 0x0xff000800 [   0    0        8        9] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  43 NN VS 0x0xff000800 [   0    0        8        9] -> VS 0x0xff01fa00 [   0    0        7        8] (  7,   8,  16) (       0,    94720, 100.000000%, 63.247865%, DD)
  44 NN VS 0x0xff01fa00 [   0    0        7        8] -> DD 0x(nil) [   0    0        7        8] (  7,   8,  16) (       0,   127488, 100.000000%, 77.329195%, DD)
  45 SH DD 0x(nil) [   0    0        0        0] -> DD 0x(nil) [   0    0        0        0] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  46 TP DD 0x(nil) [   0    0        7        8] -> VS 0x0xff000800 [   0    0        7        8] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  47 NN VS 0x0xff000800 [   0    0        7        8] -> VS 0x0xff01ec00 [   0    0        7        8] (  7,   8,  16) (       0,   116480, 100.000000%, 69.892472%, DD)
  48 NN VS 0x0xff01ec00 [   0    0        7        8] -> DD 0x(nil) [   0    0        7        8] (  7,   8,  16) (       0,   123904, 100.000000%, 74.922603%, DD)
  49 SH DD 0x(nil) [   0    0        0        0] -> DD 0x(nil) [   0    0        0        0] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  50 TP DD 0x(nil) [   0    0        7        8] -> DD 0x(nil) [   0    0        3        4] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)
  51 SH DD 0x(nil) [   0    0        0        0] -> DD 0x(nil) [   0    0        0        0] (  0,   0,   0) (       0,        0, 0.000000%, 0.000000%, NONE)

PreLoadWeightBiases = 0  -nan%
---------------------------End VerifyTiling -------------------------
vxo_insertHandle(out): old physical address: 00296300, mapIndex: 2, node_id: 15
success prerun_graph
##################VXEngineRun
VXEngineRun 0 4
prev_ptrs = 0x55b69df8cac0

Warning: swapHandel, CMD changed

 NN/TP: pre_physical:0x00296940, new_physical:0x00296940 

 NN/TP: pre_physical:0x00296978, new_physical:0x00296978 

Warning: swapHandel, CMD changed

 NN/TP: pre_physical:0x00296940, new_physical:0x00296940 

Warning: swapHandel, CMD changed

 SH:  pre_physical:0x00296300, new_physical:0x00296300, mapIndex: 2, node_id: 15
prev_ptrs = 0x55b69df8c080
thezha commented 3 years ago

Is the following layer SpatialTransformer?

10 SH [( 6 1 1 1, 12, 0x0x55b69dfe7f90(0x0x55b69dfe7f90, 0x(nil)) -> 224 128 1 1, 57344, 0x0x55b69dfe8da0(0x0x55b69dfe8da0, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 9] C[ 11]

Input shape (1, 1, 6), output shape (1, 128, 224)?

sky-fun commented 3 years ago

Yes, but in the original network the target shape of SpatialTransformer should be (3, 128, 112).

nightingalei commented 3 years ago

@sky-fun What are the datatype and parameters of the SpatialTransformer?

nightingalei commented 3 years ago

@sky-fun Could you please also provide the log with the env "export VSI_NN_LOG_LEVEL=5"? Thanks.

sky-fun commented 3 years ago

The datatype is uint8 and the parameters are (output_h, output_w, false, false, false, false, false, false, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
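For reference, with all six has_theta flags false, all six affine parameters come from the second input tensor (the (6, 1) output of the last FCL2 in the log). The operator semantics can be sketched as a plain-NumPy affine grid plus bilinear sampling; this is an illustrative reference implementation of a spatial transformer, not the TIM-VX/OpenVX kernel itself, and the function name and normalized-coordinate convention are assumptions:

```python
import numpy as np

def spatial_transformer(img, theta, out_h, out_w):
    """Reference affine spatial transformer with bilinear sampling.

    img:   (H, W) single-channel input
    theta: six affine parameters [t11, t12, t13, t21, t22, t23],
           mapping target coordinates to source coordinates.
    Coordinates are normalized to [-1, 1].
    """
    h, w = img.shape
    t = np.asarray(theta, dtype=np.float32).reshape(2, 3)
    # Target sampling grid in normalized coordinates.
    ys, xs = np.meshgrid(np.linspace(-1, 1, out_h),
                         np.linspace(-1, 1, out_w), indexing="ij")
    grid = np.stack([xs, ys, np.ones_like(xs)], axis=0).reshape(3, -1)
    src = t @ grid                      # (2, out_h*out_w) source coords
    # Map normalized source coordinates back to pixel coordinates.
    sx = (src[0] + 1) * (w - 1) / 2
    sy = (src[1] + 1) * (h - 1) / 2
    # Bilinear interpolation between the four neighboring pixels.
    x0 = np.clip(np.floor(sx).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(sy).astype(int), 0, h - 2)
    dx, dy = sx - x0, sy - y0
    out = (img[y0, x0] * (1 - dx) * (1 - dy) +
           img[y0, x0 + 1] * dx * (1 - dy) +
           img[y0 + 1, x0] * (1 - dx) * dy +
           img[y0 + 1, x0 + 1] * dx * dy)
    return out.reshape(out_h, out_w)

# With the identity theta and matching output size, the input is reproduced.
img = np.arange(16, dtype=np.float32).reshape(4, 4)
identity = [1, 0, 0, 0, 1, 0]
out = spatial_transformer(img, identity, 4, 4)
```

Checking a quantized kernel's output against such a float reference (within the output scale, here 0.007805) is one way to tell a real numerical bug apart from plain quantization error.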

sky-fun commented 3 years ago

@nightingalei


D [vsi_nn_SetupGraph:617]Sort graph nodes.
no Unprocessed node 
D [setup_node:441]Setup node id[0] uid[1] op[POOL]
D [validate_op_io_types:165]Validate [POOL]
D [print_tensor:146]in(0) : id[   1] vtl[0] const[0] shape[ 112, 128, 3, 1   ] fmt[u8 ] qnt[ASM zp=127, scale=0.007782]
D [print_tensor:146]out(0): id[   4] vtl[1] const[0] shape[ 56, 64, 3, 1     ] fmt[u8 ] qnt[ASM zp=127, scale=0.007805]
D [setup_node:441]Setup node id[1] uid[2] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[   4] vtl[1] const[0] shape[ 56, 64, 3, 1     ] fmt[u8 ] qnt[ASM zp=127, scale=0.007805]
D [print_tensor:146]in(1) : id[   3] vtl[0] const[1] shape[ 3, 3, 3, 24      ] fmt[u8 ] qnt[ASM zp=144, scale=0.008708]
D [print_tensor:146]in(2) : id[   2] vtl[0] const[1] shape[ 24               ] fmt[i32] qnt[ASM zp=  0, scale=0.000068]
D [print_tensor:146]out(0): id[ 105] vtl[1] const[0] shape[ 54, 62, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.014651]
D [setup_node:441]Setup node id[2] uid[3] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[ 105] vtl[1] const[0] shape[ 54, 62, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.014651]
D [print_tensor:146]out(0): id[   5] vtl[1] const[0] shape[ 54, 62, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.014651]
D [setup_node:441]Setup node id[3] uid[4] op[POOL]
D [validate_op_io_types:165]Validate [POOL]
D [print_tensor:146]in(0) : id[   5] vtl[1] const[0] shape[ 54, 62, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.014651]
D [print_tensor:146]out(0): id[   8] vtl[1] const[0] shape[ 27, 31, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.014651]
D [setup_node:441]Setup node id[4] uid[5] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[   8] vtl[1] const[0] shape[ 27, 31, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.014651]
D [print_tensor:146]in(1) : id[   7] vtl[0] const[1] shape[ 3, 3, 24, 24     ] fmt[u8 ] qnt[ASM zp=123, scale=0.003316]
D [print_tensor:146]in(2) : id[   6] vtl[0] const[1] shape[ 24               ] fmt[i32] qnt[ASM zp=  0, scale=0.000049]
D [print_tensor:146]out(0): id[ 106] vtl[1] const[0] shape[ 25, 29, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.006747]
D [setup_node:441]Setup node id[5] uid[6] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[ 106] vtl[1] const[0] shape[ 25, 29, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.006747]
D [print_tensor:146]out(0): id[   9] vtl[1] const[0] shape[ 25, 29, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.006747]
D [setup_node:441]Setup node id[6] uid[7] op[POOL]
D [validate_op_io_types:165]Validate [POOL]
D [print_tensor:146]in(0) : id[   9] vtl[1] const[0] shape[ 25, 29, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.006747]
D [print_tensor:146]out(0): id[  12] vtl[1] const[0] shape[ 12, 14, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.006747]
D [setup_node:441]Setup node id[7] uid[8] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  12] vtl[1] const[0] shape[ 12, 14, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.006747]
D [print_tensor:146]in(1) : id[  11] vtl[0] const[1] shape[ 3, 3, 24, 48     ] fmt[u8 ] qnt[ASM zp=132, scale=0.003830]
D [print_tensor:146]in(2) : id[  10] vtl[0] const[1] shape[ 48               ] fmt[i32] qnt[ASM zp=  0, scale=0.000026]
D [print_tensor:146]out(0): id[ 107] vtl[1] const[0] shape[ 10, 12, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.004216]
D [setup_node:441]Setup node id[8] uid[9] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[ 107] vtl[1] const[0] shape[ 10, 12, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.004216]
D [print_tensor:146]out(0): id[  13] vtl[1] const[0] shape[ 10, 12, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.004216]
D [setup_node:441]Setup node id[9] uid[10] op[POOL]
D [validate_op_io_types:165]Validate [POOL]
D [print_tensor:146]in(0) : id[  13] vtl[1] const[0] shape[ 10, 12, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.004216]
D [print_tensor:146]out(0): id[  14] vtl[1] const[0] shape[ 5, 6, 48, 1      ] fmt[u8 ] qnt[ASM zp=  0, scale=0.004216]
D [setup_node:441]Setup node id[10] uid[11] op[RESHAPE]
D [print_tensor:146]in(0) : id[  14] vtl[1] const[0] shape[ 5, 6, 48, 1      ] fmt[u8 ] qnt[ASM zp=  0, scale=0.004216]
D [print_tensor:146]out(0): id[  15] vtl[1] const[0] shape[ 1, 1, 1440, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.004216]
D [setup_node:441]Setup node id[11] uid[12] op[FCL2]
D [validate_op_io_types:165]Validate [FCL_RELU]
D [print_tensor:146]in(0) : id[  15] vtl[1] const[0] shape[ 1, 1, 1440, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.004216]
D [print_tensor:146]in(1) : id[  17] vtl[0] const[1] shape[ 1440, 32         ] fmt[u8 ] qnt[ASM zp=167, scale=0.005804]
D [print_tensor:146]in(2) : id[  16] vtl[0] const[1] shape[ 32               ] fmt[i32] qnt[ASM zp=  0, scale=0.000024]
D [print_tensor:146]out(0): id[  18] vtl[1] const[0] shape[ 32, 1            ] fmt[u8 ] qnt[ASM zp=  0, scale=0.005455]
D [setup_node:441]Setup node id[12] uid[13] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[  18] vtl[1] const[0] shape[ 32, 1            ] fmt[u8 ] qnt[ASM zp=  0, scale=0.005455]
D [print_tensor:146]out(0): id[  19] vtl[1] const[0] shape[ 32, 1            ] fmt[u8 ] qnt[ASM zp=  0, scale=0.005455]
D [setup_node:441]Setup node id[13] uid[14] op[FCL2]
D [validate_op_io_types:165]Validate [FCL_RELU]
D [print_tensor:146]in(0) : id[  19] vtl[1] const[0] shape[ 32, 1            ] fmt[u8 ] qnt[ASM zp=  0, scale=0.005455]
D [print_tensor:146]in(1) : id[  21] vtl[0] const[1] shape[ 32, 6            ] fmt[u8 ] qnt[ASM zp=121, scale=0.007150]
D [print_tensor:146]in(2) : id[  20] vtl[0] const[1] shape[ 6                ] fmt[i32] qnt[ASM zp=  0, scale=0.000039]
D [print_tensor:146]out(0): id[  22] vtl[1] const[0] shape[ 6, 1             ] fmt[u8 ] qnt[ASM zp=114, scale=0.013708]
D [setup_node:441]Setup node id[14] uid[15] op[SPATIAL_TRANSFORMER]
D [print_tensor:146]in(0) : id[   1] vtl[0] const[0] shape[ 112, 128, 3, 1   ] fmt[u8 ] qnt[ASM zp=127, scale=0.007782]
D [print_tensor:146]in(1) : id[  22] vtl[1] const[0] shape[ 6, 1             ] fmt[u8 ] qnt[ASM zp=114, scale=0.013708]
D [print_tensor:146]out(0): id[  25] vtl[1] const[0] shape[ 112, 128, 3, 1   ] fmt[u8 ] qnt[ASM zp=127, scale=0.007805]
D [setup_node:441]Setup node id[15] uid[16] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  25] vtl[1] const[0] shape[ 112, 128, 3, 1   ] fmt[u8 ] qnt[ASM zp=127, scale=0.007805]
D [print_tensor:146]in(1) : id[  24] vtl[0] const[1] shape[ 3, 3, 3, 24      ] fmt[u8 ] qnt[ASM zp= 98, scale=0.020664]
D [print_tensor:146]in(2) : id[  23] vtl[0] const[1] shape[ 24               ] fmt[i32] qnt[ASM zp=  0, scale=0.000161]
D [print_tensor:146]out(0): id[ 108] vtl[1] const[0] shape[ 112, 128, 24, 1  ] fmt[u8 ] qnt[ASM zp=  0, scale=0.003564]
D [setup_node:441]Setup node id[16] uid[17] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[ 108] vtl[1] const[0] shape[ 112, 128, 24, 1  ] fmt[u8 ] qnt[ASM zp=  0, scale=0.003564]
D [print_tensor:146]out(0): id[  28] vtl[1] const[0] shape[ 112, 128, 24, 1  ] fmt[u8 ] qnt[ASM zp=  0, scale=0.003564]
D [setup_node:441]Setup node id[17] uid[18] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  28] vtl[1] const[0] shape[ 112, 128, 24, 1  ] fmt[u8 ] qnt[ASM zp=  0, scale=0.003564]
D [print_tensor:146]in(1) : id[  27] vtl[0] const[1] shape[ 3, 3, 24, 24     ] fmt[u8 ] qnt[ASM zp=161, scale=0.007057]
D [print_tensor:146]in(2) : id[  26] vtl[0] const[1] shape[ 24               ] fmt[i32] qnt[ASM zp=  0, scale=0.000025]
D [print_tensor:146]out(0): id[ 109] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.003384]
D [setup_node:441]Setup node id[18] uid[19] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[ 109] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.003384]
D [print_tensor:146]out(0): id[  31] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.003384]
D [setup_node:441]Setup node id[19] uid[20] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  28] vtl[1] const[0] shape[ 112, 128, 24, 1  ] fmt[u8 ] qnt[ASM zp=  0, scale=0.003564]
D [print_tensor:146]in(1) : id[  30] vtl[0] const[1] shape[ 1, 1, 24, 24     ] fmt[u8 ] qnt[ASM zp=138, scale=0.003668]
D [print_tensor:146]in(2) : id[  29] vtl[0] const[1] shape[ 24               ] fmt[i32] qnt[ASM zp=  0, scale=0.000013]
D [print_tensor:146]out(0): id[  34] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp= 68, scale=0.005292]
D [setup_node:441]Setup node id[20] uid[21] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  31] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.003384]
D [print_tensor:146]in(1) : id[  32] vtl[0] const[1] shape[ 3, 3, 24, 24     ] fmt[u8 ] qnt[ASM zp=118, scale=0.009258]
D [print_tensor:146]in(2) : id[  33] vtl[0] const[1] shape[ 24               ] fmt[i32] qnt[ASM zp=  0, scale=0.000031]
D [print_tensor:146]out(0): id[  35] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp= 96, scale=0.013286]
D [setup_node:441]Setup node id[21] uid[22] op[ADD]
D [validate_op_io_types:165]Validate [ADD]
D [print_tensor:146]in(0) : id[  35] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp= 96, scale=0.013286]
D [print_tensor:146]in(1) : id[  34] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp= 68, scale=0.005292]
D [print_tensor:146]out(0): id[  36] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=100, scale=0.011393]
D [setup_node:441]Setup node id[22] uid[23] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[  36] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=100, scale=0.011393]
D [print_tensor:146]out(0): id[  39] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.004456]
D [setup_node:441]Setup node id[23] uid[24] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  39] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.004456]
D [print_tensor:146]in(1) : id[  38] vtl[0] const[1] shape[ 3, 3, 24, 24     ] fmt[u8 ] qnt[ASM zp=111, scale=0.006807]
D [print_tensor:146]in(2) : id[  37] vtl[0] const[1] shape[ 24               ] fmt[i32] qnt[ASM zp=  0, scale=0.000030]
D [print_tensor:146]out(0): id[ 110] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.005790]
D [setup_node:441]Setup node id[24] uid[25] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[ 110] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.005790]
D [print_tensor:146]out(0): id[  40] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.005790]
D [setup_node:441]Setup node id[25] uid[26] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  40] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.005790]
D [print_tensor:146]in(1) : id[  41] vtl[0] const[1] shape[ 3, 3, 24, 24     ] fmt[u8 ] qnt[ASM zp=112, scale=0.006564]
D [print_tensor:146]in(2) : id[  42] vtl[0] const[1] shape[ 24               ] fmt[i32] qnt[ASM zp=  0, scale=0.000038]
D [print_tensor:146]out(0): id[  43] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=123, scale=0.044463]
D [setup_node:441]Setup node id[26] uid[27] op[ADD]
D [validate_op_io_types:165]Validate [ADD]
D [print_tensor:146]in(0) : id[  43] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=123, scale=0.044463]
D [print_tensor:146]in(1) : id[  36] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=100, scale=0.011393]
D [print_tensor:146]out(0): id[  44] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.021985]
D [setup_node:441]Setup node id[27] uid[28] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[  44] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.021985]
D [print_tensor:146]out(0): id[  47] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.021985]
D [setup_node:441]Setup node id[28] uid[29] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  47] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.021985]
D [print_tensor:146]in(1) : id[  46] vtl[0] const[1] shape[ 3, 3, 24, 48     ] fmt[u8 ] qnt[ASM zp=106, scale=0.004273]
D [print_tensor:146]in(2) : id[  45] vtl[0] const[1] shape[ 48               ] fmt[i32] qnt[ASM zp=  0, scale=0.000094]
D [print_tensor:146]out(0): id[ 111] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.029140]
D [setup_node:441]Setup node id[29] uid[30] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[ 111] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.029140]
D [print_tensor:146]out(0): id[  50] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.029140]
D [setup_node:441]Setup node id[30] uid[31] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  47] vtl[1] const[0] shape[ 56, 64, 24, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.021985]
D [print_tensor:146]in(1) : id[  48] vtl[0] const[1] shape[ 1, 1, 24, 48     ] fmt[u8 ] qnt[ASM zp=124, scale=0.005828]
D [print_tensor:146]in(2) : id[  49] vtl[0] const[1] shape[ 48               ] fmt[i32] qnt[ASM zp=  0, scale=0.000128]
D [print_tensor:146]out(0): id[  53] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=141, scale=0.027526]
D [setup_node:441]Setup node id[31] uid[32] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  50] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.029140]
D [print_tensor:146]in(1) : id[  52] vtl[0] const[1] shape[ 3, 3, 48, 48     ] fmt[u8 ] qnt[ASM zp=125, scale=0.004647]
D [print_tensor:146]in(2) : id[  51] vtl[0] const[1] shape[ 48               ] fmt[i32] qnt[ASM zp=  0, scale=0.000135]
D [print_tensor:146]out(0): id[  54] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=134, scale=0.114629]
D [setup_node:441]Setup node id[32] uid[33] op[ADD]
D [validate_op_io_types:165]Validate [ADD]
D [print_tensor:146]in(0) : id[  54] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=134, scale=0.114629]
D [print_tensor:146]in(1) : id[  53] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=141, scale=0.027526]
D [print_tensor:146]out(0): id[  55] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=146, scale=0.112530]
D [setup_node:441]Setup node id[33] uid[34] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[  55] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=146, scale=0.112530]
D [print_tensor:146]out(0): id[  58] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.044880]
D [setup_node:441]Setup node id[34] uid[35] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  58] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.044880]
D [print_tensor:146]in(1) : id[  57] vtl[0] const[1] shape[ 3, 3, 48, 48     ] fmt[u8 ] qnt[ASM zp=105, scale=0.006360]
D [print_tensor:146]in(2) : id[  56] vtl[0] const[1] shape[ 48               ] fmt[i32] qnt[ASM zp=  0, scale=0.000285]
D [print_tensor:146]out(0): id[ 112] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.086943]
D [setup_node:441]Setup node id[35] uid[36] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[ 112] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.086943]
D [print_tensor:146]out(0): id[  59] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.086943]
D [setup_node:441]Setup node id[36] uid[37] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  59] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.086943]
D [print_tensor:146]in(1) : id[  60] vtl[0] const[1] shape[ 3, 3, 48, 48     ] fmt[u8 ] qnt[ASM zp=164, scale=0.007681]
D [print_tensor:146]in(2) : id[  61] vtl[0] const[1] shape[ 48               ] fmt[i32] qnt[ASM zp=  0, scale=0.000668]
D [print_tensor:146]out(0): id[  62] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=159, scale=0.370318]
D [setup_node:441]Setup node id[37] uid[38] op[ADD]
D [validate_op_io_types:165]Validate [ADD]
D [print_tensor:146]in(0) : id[  62] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=159, scale=0.370318]
D [print_tensor:146]in(1) : id[  55] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=146, scale=0.112530]
D [print_tensor:146]out(0): id[  63] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.163684]
D [setup_node:441]Setup node id[38] uid[39] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[  63] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.163684]
D [print_tensor:146]out(0): id[  66] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.163684]
D [setup_node:441]Setup node id[41] uid[42] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  66] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.163684]
D [print_tensor:146]in(1) : id[  67] vtl[0] const[1] shape[ 1, 1, 48, 64     ] fmt[u8 ] qnt[ASM zp=129, scale=0.003319]
D [print_tensor:146]in(2) : id[  68] vtl[0] const[1] shape[ 64               ] fmt[i32] qnt[ASM zp=  0, scale=0.000543]
D [print_tensor:146]out(0): id[  72] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=103, scale=0.108540]
D [setup_node:441]Setup node id[39] uid[40] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  66] vtl[1] const[0] shape[ 28, 32, 48, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.163684]
D [print_tensor:146]in(1) : id[  65] vtl[0] const[1] shape[ 3, 3, 48, 64     ] fmt[u8 ] qnt[ASM zp=129, scale=0.005332]
D [print_tensor:146]in(2) : id[  64] vtl[0] const[1] shape[ 64               ] fmt[i32] qnt[ASM zp=  0, scale=0.000873]
D [print_tensor:146]out(0): id[ 113] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.231447]
D [setup_node:441]Setup node id[40] uid[41] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[ 113] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.231447]
D [print_tensor:146]out(0): id[  69] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.231447]
D [setup_node:441]Setup node id[42] uid[43] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  69] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.231447]
D [print_tensor:146]in(1) : id[  71] vtl[0] const[1] shape[ 3, 3, 64, 64     ] fmt[u8 ] qnt[ASM zp=134, scale=0.003610]
D [print_tensor:146]in(2) : id[  70] vtl[0] const[1] shape[ 64               ] fmt[i32] qnt[ASM zp=  0, scale=0.000836]
D [print_tensor:146]out(0): id[  73] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=140, scale=0.829758]
D [setup_node:441]Setup node id[43] uid[44] op[ADD]
D [validate_op_io_types:165]Validate [ADD]
D [print_tensor:146]in(0) : id[  73] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=140, scale=0.829758]
D [print_tensor:146]in(1) : id[  72] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=103, scale=0.108540]
D [print_tensor:146]out(0): id[  74] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=138, scale=0.906703]
D [setup_node:441]Setup node id[44] uid[45] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[  74] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=138, scale=0.906703]
D [print_tensor:146]out(0): id[  77] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.397434]
D [setup_node:441]Setup node id[45] uid[46] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  77] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.397434]
D [print_tensor:146]in(1) : id[  76] vtl[0] const[1] shape[ 3, 3, 64, 64     ] fmt[u8 ] qnt[ASM zp= 94, scale=0.005339]
D [print_tensor:146]in(2) : id[  75] vtl[0] const[1] shape[ 64               ] fmt[i32] qnt[ASM zp=  0, scale=0.002122]
D [print_tensor:146]out(0): id[ 114] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.446657]
D [setup_node:441]Setup node id[46] uid[47] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[ 114] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.446657]
D [print_tensor:146]out(0): id[  78] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.446657]
D [setup_node:441]Setup node id[47] uid[48] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  78] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.446657]
D [print_tensor:146]in(1) : id[  80] vtl[0] const[1] shape[ 3, 3, 64, 64     ] fmt[u8 ] qnt[ASM zp=124, scale=0.004056]
D [print_tensor:146]in(2) : id[  79] vtl[0] const[1] shape[ 64               ] fmt[i32] qnt[ASM zp=  0, scale=0.001812]
D [print_tensor:146]out(0): id[  81] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=184, scale=2.196078]
D [setup_node:441]Setup node id[48] uid[49] op[ADD]
D [validate_op_io_types:165]Validate [ADD]
D [print_tensor:146]in(0) : id[  81] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=184, scale=2.196078]
D [print_tensor:146]in(1) : id[  74] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=138, scale=0.906703]
D [print_tensor:146]out(0): id[  82] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.586414]
D [setup_node:441]Setup node id[49] uid[50] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[  82] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.586414]
D [print_tensor:146]out(0): id[  85] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.586414]
D [setup_node:441]Setup node id[52] uid[53] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  85] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.586414]
D [print_tensor:146]in(1) : id[  86] vtl[0] const[1] shape[ 1, 1, 64, 128    ] fmt[u8 ] qnt[ASM zp=118, scale=0.002226]
D [print_tensor:146]in(2) : id[  87] vtl[0] const[1] shape[ 128              ] fmt[i32] qnt[ASM zp=  0, scale=0.001305]
D [print_tensor:146]out(0): id[  91] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=138, scale=0.493036]
D [setup_node:441]Setup node id[50] uid[51] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  85] vtl[1] const[0] shape[ 14, 16, 64, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.586414]
D [print_tensor:146]in(1) : id[  84] vtl[0] const[1] shape[ 3, 3, 64, 128    ] fmt[u8 ] qnt[ASM zp=124, scale=0.004449]
D [print_tensor:146]in(2) : id[  83] vtl[0] const[1] shape[ 128              ] fmt[i32] qnt[ASM zp=  0, scale=0.002609]
D [print_tensor:146]out(0): id[ 115] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=  0, scale=0.594060]
D [setup_node:441]Setup node id[51] uid[52] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[ 115] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=  0, scale=0.594060]
D [print_tensor:146]out(0): id[  88] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=  0, scale=0.594060]
D [setup_node:441]Setup node id[53] uid[54] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  88] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=  0, scale=0.594060]
D [print_tensor:146]in(1) : id[  90] vtl[0] const[1] shape[ 3, 3, 128, 128   ] fmt[u8 ] qnt[ASM zp=129, scale=0.006039]
D [print_tensor:146]in(2) : id[  89] vtl[0] const[1] shape[ 128              ] fmt[i32] qnt[ASM zp=  0, scale=0.003588]
D [print_tensor:146]out(0): id[  92] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=155, scale=1.879793]
D [setup_node:441]Setup node id[54] uid[55] op[ADD]
D [validate_op_io_types:165]Validate [ADD]
D [print_tensor:146]in(0) : id[  92] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=155, scale=1.879793]
D [print_tensor:146]in(1) : id[  91] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=138, scale=0.493036]
D [print_tensor:146]out(0): id[  93] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=153, scale=1.886373]
D [setup_node:441]Setup node id[55] uid[56] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[  93] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=153, scale=1.886373]
D [print_tensor:146]out(0): id[  96] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=  0, scale=0.714904]
D [setup_node:441]Setup node id[56] uid[57] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  96] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=  0, scale=0.714904]
D [print_tensor:146]in(1) : id[  95] vtl[0] const[1] shape[ 3, 3, 128, 128   ] fmt[u8 ] qnt[ASM zp=178, scale=0.010573]
D [print_tensor:146]in(2) : id[  94] vtl[0] const[1] shape[ 128              ] fmt[i32] qnt[ASM zp=  0, scale=0.007559]
D [print_tensor:146]out(0): id[ 116] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=  0, scale=1.100026]
D [setup_node:441]Setup node id[57] uid[58] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[ 116] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=  0, scale=1.100026]
D [print_tensor:146]out(0): id[  97] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=  0, scale=1.100026]
D [setup_node:441]Setup node id[58] uid[59] op[CONV2D]
D [validate_op_io_types:165]Validate [CONV2D]
D [print_tensor:146]in(0) : id[  97] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=  0, scale=1.100026]
D [print_tensor:146]in(1) : id[  99] vtl[0] const[1] shape[ 3, 3, 128, 128   ] fmt[u8 ] qnt[ASM zp=163, scale=0.008378]
D [print_tensor:146]in(2) : id[  98] vtl[0] const[1] shape[ 128              ] fmt[i32] qnt[ASM zp=  0, scale=0.009216]
D [print_tensor:146]out(0): id[ 100] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=197, scale=4.976527]
D [setup_node:441]Setup node id[59] uid[60] op[ADD]
D [validate_op_io_types:165]Validate [ADD]
D [print_tensor:146]in(0) : id[ 100] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=197, scale=4.976527]
D [print_tensor:146]in(1) : id[  93] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=153, scale=1.886373]
D [print_tensor:146]out(0): id[ 101] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=  0, scale=0.811229]
D [setup_node:441]Setup node id[60] uid[61] op[RELU]
D [validate_op_io_types:165]Validate [RELU]
D [print_tensor:146]in(0) : id[ 101] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=  0, scale=0.811229]
D [print_tensor:146]out(0): id[ 102] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=  0, scale=0.811229]
D [setup_node:441]Setup node id[61] uid[62] op[POOL]
D [validate_op_io_types:165]Validate [POOL]
D [print_tensor:146]in(0) : id[ 102] vtl[1] const[0] shape[ 7, 8, 128, 1     ] fmt[u8 ] qnt[ASM zp=  0, scale=0.811229]
D [print_tensor:146]out(0): id[ 103] vtl[1] const[0] shape[ 3, 4, 128, 1     ] fmt[u8 ] qnt[ASM zp=  0, scale=0.811229]
D [setup_node:441]Setup node id[62] uid[63] op[RESHAPE]
D [print_tensor:146]in(0) : id[ 103] vtl[1] const[0] shape[ 3, 4, 128, 1     ] fmt[u8 ] qnt[ASM zp=  0, scale=0.811229]
D [print_tensor:146]out(0): id[ 104] vtl[1] const[0] shape[ 1, 1, 1536, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.811229]
D [setup_node:441]Setup node id[63] uid[64] op[L2_NORMALIZE]
D [validate_op_io_types:165]Validate [L2_NORMALIZE]
D [print_tensor:146]in(0) : id[ 104] vtl[1] const[0] shape[ 1, 1, 1536, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.811229]
D [print_tensor:146]out(0): id[   0] vtl[0] const[0] shape[ 1, 1, 1536, 1    ] fmt[u8 ] qnt[ASM zp=  0, scale=0.000651]
D [optimize_node:385]Backward optimize neural network
D [op_optimize:113]Optimize RESHAPE, uid 63
D [op_optimize:113]Optimize RESHAPE, uid 11
D [optimize_node:392]Forward optimize neural network
D [op_optimize:113]Optimize RESHAPE, uid 11
D [op_optimize:113]Optimize RESHAPE, uid 63
I [compute_node:327]Create vx node
D [compute_node:350]Instance node[0] "POOL" ...
D [compute_node:350]Instance node[1] "CONV2D" ...
D [compute_node:350]Instance node[2] "RELU" ...
D [compute_node:350]Instance node[3] "POOL" ...
D [compute_node:350]Instance node[4] "CONV2D" ...
D [compute_node:350]Instance node[5] "RELU" ...
D [compute_node:350]Instance node[6] "POOL" ...
D [compute_node:350]Instance node[7] "CONV2D" ...
D [compute_node:350]Instance node[8] "RELU" ...
D [compute_node:350]Instance node[9] "POOL" ...
D [compute_node:350]Instance node[10] "RESHAPE" ...
D [compute_node:350]Instance node[11] "FCL2" ...
D [compute_node:350]Instance node[12] "RELU" ...
D [compute_node:350]Instance node[13] "FCL2" ...
D [compute_node:350]Instance node[14] "SPATIAL_TRANSFORMER" ...
Kernel "com.vivantecorp.extension.vxcTransform_setupThres_F16toF16" does not exist
I [vsi_nn_RegisterClientKernelAndNewNode:421]Register client kernel com.vivantecorp.extension.vxcTransform_setupThres_F16toF16 successfully.
Kernel "com.vivantecorp.extension.vxcTransform_Gemm_F16toF16" does not exist
I [vsi_nn_RegisterClientKernelAndNewNode:421]Register client kernel com.vivantecorp.extension.vxcTransform_Gemm_F16toF16 successfully.
Kernel "com.vivantecorp.extension.vxcTransform_InterP_F16toF16" does not exist
I [vsi_nn_RegisterClientKernelAndNewNode:421]Register client kernel com.vivantecorp.extension.vxcTransform_InterP_F16toF16 successfully.
D [compute_node:350]Instance node[15] "CONV2D" ...
D [compute_node:350]Instance node[16] "RELU" ...
D [compute_node:350]Instance node[17] "CONV2D" ...
D [compute_node:350]Instance node[18] "RELU" ...
D [compute_node:350]Instance node[19] "CONV2D" ...
D [compute_node:350]Instance node[20] "CONV2D" ...
D [compute_node:350]Instance node[21] "ADD" ...
D [vsi_nn_kernel_selector:970]Instance OPENVX node with kernel "add" 
D [compute_node:350]Instance node[22] "RELU" ...
D [compute_node:350]Instance node[23] "CONV2D" ...
D [compute_node:350]Instance node[24] "RELU" ...
D [compute_node:350]Instance node[25] "CONV2D" ...
D [compute_node:350]Instance node[26] "ADD" ...
D [vsi_nn_kernel_selector:970]Instance OPENVX node with kernel "add" 
D [compute_node:350]Instance node[27] "RELU" ...
D [compute_node:350]Instance node[28] "CONV2D" ...
D [compute_node:350]Instance node[29] "RELU" ...
D [compute_node:350]Instance node[30] "CONV2D" ...
D [compute_node:350]Instance node[31] "CONV2D" ...
D [compute_node:350]Instance node[32] "ADD" ...
D [vsi_nn_kernel_selector:970]Instance OPENVX node with kernel "add" 
D [compute_node:350]Instance node[33] "RELU" ...
D [compute_node:350]Instance node[34] "CONV2D" ...
D [compute_node:350]Instance node[35] "RELU" ...
D [compute_node:350]Instance node[36] "CONV2D" ...
D [compute_node:350]Instance node[37] "ADD" ...
D [vsi_nn_kernel_selector:970]Instance OPENVX node with kernel "add" 
D [compute_node:350]Instance node[38] "RELU" ...
D [compute_node:350]Instance node[41] "CONV2D" ...
D [compute_node:350]Instance node[39] "CONV2D" ...
D [compute_node:350]Instance node[40] "RELU" ...
D [compute_node:350]Instance node[42] "CONV2D" ...
D [compute_node:350]Instance node[43] "ADD" ...
D [vsi_nn_kernel_selector:970]Instance OPENVX node with kernel "add" 
D [compute_node:350]Instance node[44] "RELU" ...
D [compute_node:350]Instance node[45] "CONV2D" ...
D [compute_node:350]Instance node[46] "RELU" ...
D [compute_node:350]Instance node[47] "CONV2D" ...
D [compute_node:350]Instance node[48] "ADD" ...
D [vsi_nn_kernel_selector:970]Instance OPENVX node with kernel "add" 
D [compute_node:350]Instance node[49] "RELU" ...
D [compute_node:350]Instance node[52] "CONV2D" ...
D [compute_node:350]Instance node[50] "CONV2D" ...
D [compute_node:350]Instance node[51] "RELU" ...
D [compute_node:350]Instance node[53] "CONV2D" ...
D [compute_node:350]Instance node[54] "ADD" ...
D [vsi_nn_kernel_selector:970]Instance OPENVX node with kernel "add" 
D [compute_node:350]Instance node[55] "RELU" ...
D [compute_node:350]Instance node[56] "CONV2D" ...
D [compute_node:350]Instance node[57] "RELU" ...
D [compute_node:350]Instance node[58] "CONV2D" ...
D [compute_node:350]Instance node[59] "ADD" ...
D [vsi_nn_kernel_selector:970]Instance OPENVX node with kernel "add" 
D [compute_node:350]Instance node[60] "RELU" ...
D [compute_node:350]Instance node[61] "POOL" ...
D [compute_node:350]Instance node[62] "RESHAPE" ...
D [compute_node:350]Instance node[63] "L2_NORMALIZE" ...
---------------------------Begin VerifyTiling -------------------------
sky-fun commented 3 years ago

@nightingalei OK, now I can see "Register client kernel com.vivantecorp.extension.vxcTransform_setupThres_F16toF16 successfully", thanks. The unexpected result may come from somewhere else.

nightingalei commented 3 years ago

@sky-fun The network seems to run successfully. Could you please make a unit test for SpatialTransformer from Tengine and make sure it's correct? SpatialTransformer may be implemented differently by different NN engines; TIM-VX follows this implementation (https://github.com/christopher5106/last_caffe_with_stn/blob/master/src/caffe/layers/st_layer.cpp).

sky-fun commented 3 years ago

@nightingalei Sure, I will work on this to find out why I got wrong results.

sky-fun commented 3 years ago

@nightingalei After checking the mxnet version of SpatialTransformer, I found a difference in the implementations: in vsi_nn_op_spatial_transformer.c the grid data is ordered (y, x, 1) and the GEMM result is (y, x), while in our mxnet model the grid data is (x, y, 1) and the GEMM result is (x, y). After I manually reordered the theta data from (t0, t1, t2, t3, t4, t5) to (t4, t3, t5, t1, t0, t2), the SpatialTransformer result is close to that of the original model.
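The theta reorder described above amounts to a fixed permutation of the six affine parameters. A minimal sketch (plain C++ for illustration, not TIM-VX code; the function name is hypothetical):

```cpp
#include <array>

// Reorder a 2x3 affine matrix theta from the (x, y) row order produced
// by the mxnet model to the (y, x) row order expected by TIM-VX's grid:
// (t0, t1, t2, t3, t4, t5) -> (t4, t3, t5, t1, t0, t2)
std::array<float, 6> ReorderTheta(const std::array<float, 6>& t) {
    return {t[4], t[3], t[5], t[1], t[0], t[2]};
}
```

In a real graph the same effect is achieved with tensor operations on the theta input rather than host-side code.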

thezha commented 3 years ago

@sky-fun Could you clarify: did you modify the implementation in TIM-VX or the one in mxnet?

sky-fun commented 3 years ago

@thezha In fact, I added tim::vx::ops::Split and tim::vx::ops::Concat to reorder the second input (the theta data) before calling tim::vx::ops::SpatialTransformer. So the modification is in the place where Tengine-Lite adds the TIM-VX node.

thezha commented 3 years ago

@sky-fun It's great that you can work around this. Let us know if you see any performance bottleneck caused by the additional channel shuffle operations.

sky-fun commented 3 years ago

@thezha I found another difference between the mxnet version and the caffe version: mxnet normalizes the grid coordinates by dividing by (W-1) and (H-1) (https://github.com/apache/incubator-mxnet/blob/8fd17cef2ee854239c6e66f2dd3b9467aa222f79/src/operator/spatial_transformer-inl.h#L99), while the caffe version divides by W and H (https://github.com/christopher5106/last_caffe_with_stn/blob/037482abdfa801f9c606ae878ed8cd3dd2bec675/src/caffe/layers/st_layer.cpp#L118). They consequently also differ in the interpolation step, where the GEMM result is mapped back to source coordinates.

So is there anything I can do to fix this deviation?

thezha commented 3 years ago

In the next version, we will add an align_corners attribute to the spatial transformer operation. It will be available by the end of this month. Thanks!

rniranjan93 commented 3 years ago

I have a .tflite file. I want to build a graph with the Delegate class and return it using the following method: std::shared_ptr<tim::vx::Graph>& GetGraph() { return graph_; }. Could you provide an example, i.e., taking a .tflite file as input and returning a built tim::vx::Graph?

sunshinemyson commented 3 years ago

I have a .tflite file , I want to build a graph with Delegate class and want to return the graph using the following method std::shared_ptr<tim::vx::Graph>& GetGraph() { return graph_; } could you provide an example , i.e, taking .tflite as input -> returning a builded tim::vx::Graph

What is the purpose of this usage? It seems strange to me to use the tflite interpreter to extract the model but not to run inference with it.

thezha commented 3 years ago

@sky-fun The align_corners option was added to SpatialTransformer in 6a949bb315b7d711cf6f73056f1022ae23d32579.

sky-fun commented 3 years ago

@thezha thanks