Closed jyoungyun closed 2 months ago
@jyoungyun I'm interested in supporting `Relu6`. If you don't mind, can I work on this?
@jyoungyun I'm interested in supporting `Add`, if no one else is working on it at the moment.
I changed transpose -> pad.
[ Linearize ] %125 = @57_Add(%121,%124)
[ Linearize ] %121 = @53_Add(%117,%120)
[ Linearize ] %117 = @49_Conv2D(%116, %58,%135)
[ Linearize ] %116 = @48_DepthwiseConv2D(%115, %25,%137)
[ Linearize ] %115 = @47_Conv2D(%114, %57,%137)
[ Linearize ] %114 = @46_Add(%110,%113)
...
[ Linearize ] %124 = @56_Conv2D(%123, %62, %26)
[ Linearize ] %123 = @55_DepthwiseConv2D(%122, %28,%136)
[ Linearize ] %122 = @54_Conv2D(%121, %61, %7)
The `57 Add` operator has two inputs, 121 and 124. To perform back-propagation properly, the backward step of `56 Conv2D` should be calculated before that of the `53 Add` operator. However, the current backward ordering does not consider this case, which causes an error in the loss value.
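The ordering constraint described above can be sketched as a reverse topological sort: an operator's backward step becomes ready only after every consumer of its output has been processed. This is a minimal sketch under stated assumptions, not onert's actual implementation. The `OPS` table and the `backward_order` function are hypothetical names, and only the data edges visible in the log are modelled (weight and bias operands are left out).

```python
from collections import deque

# Hypothetical mini-graph copied from the log: op name -> (output id, data input ids).
# Weight/bias operands are omitted because no operator in the graph produces them.
OPS = {
    "53_Add": (121, [117, 120]),
    "54_Conv2D": (122, [121]),
    "55_DepthwiseConv2D": (123, [122]),
    "56_Conv2D": (124, [123]),
    "57_Add": (125, [121, 124]),
}


def backward_order(ops):
    """Return op names so that every consumer precedes its producer."""
    producer = {out: name for name, (out, _) in ops.items()}

    # pending[op] = number of consumers whose backward step must run first.
    pending = {name: 0 for name in ops}
    for _, inputs in ops.values():
        for tensor in inputs:
            if tensor in producer:
                pending[producer[tensor]] += 1

    # Start from ops whose output nobody in this graph consumes.
    ready = deque(name for name, count in pending.items() if count == 0)
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for tensor in ops[name][1]:
            if tensor in producer:
                upstream = producer[tensor]
                pending[upstream] -= 1
                if pending[upstream] == 0:
                    ready.append(upstream)
    return order


order = backward_order(OPS)
print(order)
```

With this ordering, `53_Add` is scheduled only after both `57_Add` and the `54`/`55`/`56` branch have contributed their gradients for tensor 121.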
/cc @Aeren1564
In this graph, the output of `53 Add` is used by both the `54 Conv2D` and `57 Add` operators. During back-propagation, both branches are calculated and each produces a gradient value. However, since there is only one 121 tensor, both gradient values need to be applied to that one tensor. Currently, the gradient calculated later overwrites the previous gradient value of 121, which causes an error in the loss value.
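The difference between overwriting and combining the two gradients can be illustrated with a small sketch. It assumes the usual chain-rule behaviour that gradients arriving at a shared tensor are summed. The `accumulate` helper, the `grads` dictionary, and the numeric values are all hypothetical and do not come from onert.

```python
def accumulate(grads, tensor_id, incoming):
    """Add `incoming` to the stored gradient of `tensor_id` instead of replacing it."""
    if tensor_id in grads:
        grads[tensor_id] = [a + b for a, b in zip(grads[tensor_id], incoming)]
    else:
        grads[tensor_id] = list(incoming)


# Hypothetical gradients flowing back into tensor 121 from its two consumers.
from_57_add = [0.5, -1.0]
from_54_conv2d = [0.25, 2.0]

# Behaviour described in the comment: the later write wins.
overwritten = {}
overwritten[121] = list(from_57_add)
overwritten[121] = list(from_54_conv2d)

# Accumulating behaviour: both contributions are kept.
grads = {}
accumulate(grads, 121, from_57_add)
accumulate(grads, 121, from_54_conv2d)

print(overwritten[121])  # contribution from 57 Add is lost
print(grads[121])        # sum of both contributions
```

In the overwriting case the contribution from `57 Add` disappears entirely, so every operator upstream of tensor 121 receives an incomplete gradient.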
/cc @ragmani
In my test environment,
Dataset: imagenet_a
== training parameter ==
- learning_rate = 0.001
- batch_size = 10
- loss_info = {loss = categorical crossentropy, reduction = sum over batch size}
- optimizer = adam
========================
step count per 1 epoch | TensorFlow | onert_train(rel) |
---|---|---|
1 | 4.9956 | 5.0758 |
10 | 13.8550 | 55.3730 |
100 | 87.8630 | 539.7906 |
😮
If you need detailed information, please contact me. :)
Questions
- Does (`rel`) mean onert_train with build mode `release`?
- IMHO, onert_train's result seems to increase linearly. We should investigate TF's optimization points.
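One quick way to check the "linearly increased" observation is to divide each reported loss by its step count. The sketch below uses only the numbers from the table in this thread. It is a diagnostic aid, not an explanation: it does not identify why the two frameworks differ, and the variable names are made up for illustration.

```python
# Values copied from the comparison table in this thread.
steps = [1, 10, 100]
loss = {
    "TensorFlow": [4.9956, 13.8550, 87.8630],
    "onert_train": [5.0758, 55.3730, 539.7906],
}

# Loss divided by step count; a roughly constant ratio means linear growth.
per_step = {
    name: [value / step for value, step in zip(values, steps)]
    for name, values in loss.items()
}

for name, ratios in per_step.items():
    print(name, [round(r, 4) for r in ratios])
```

For onert_train the ratio stays between roughly 5.0 and 5.6 across all three rows, which is consistent with the reported loss growing in proportion to the step count, while the TensorFlow ratio drops as the step count grows.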
Even though there are some issues with the loss value, this model trains well in the ONERT framework. I will close this issue and follow up on the loss issues separately. :)
Target model
MobileNetV2 model from tensorflow
model structure
model summary
```
Model: "mobilenetv2_1.00_224"
__________________________________________________________________________________________________
Layer (type)                                       Output Shape           Param #  Connected to
==================================================================================================
input_1 (InputLayer)                               [(None, 224, 224, 3)]  0        []
Conv1 (Conv2D)                                     (None, 112, 112, 32)   864      ['input_1[0][0]']
bn_Conv1 (BatchNormalization)                      (None, 112, 112, 32)   128      ['Conv1[0][0]']
Conv1_relu (ReLU)                                  (None, 112, 112, 32)   0        ['bn_Conv1[0][0]']
expanded_conv_depthwise (DepthwiseConv2D)          (None, 112, 112, 32)   288      ['Conv1_relu[0][0]']
expanded_conv_depthwise_BN (BatchNormalization)    (None, 112, 112, 32)   128      ['expanded_conv_depthwise[0][0]']
expanded_conv_depthwise_relu (ReLU)                (None, 112, 112, 32)   0        ['expanded_conv_depthwise_BN[0][0]']
expanded_conv_project (Conv2D)                     (None, 112, 112, 16)   512      ['expanded_conv_depthwise_relu[0][0]']
expanded_conv_project_BN (BatchNormalization)      (None, 112, 112, 16)   64       ['expanded_conv_project[0][0]']
block_1_expand (Conv2D)                            (None, 112, 112, 96)   1536     ['expanded_conv_project_BN[0][0]']
block_1_expand_BN (BatchNormalization)             (None, 112, 112, 96)   384      ['block_1_expand[0][0]']
block_1_expand_relu (ReLU)                         (None, 112, 112, 96)   0        ['block_1_expand_BN[0][0]']
block_1_pad (ZeroPadding2D)                        (None, 113, 113, 96)   0        ['block_1_expand_relu[0][0]']
block_1_depthwise (DepthwiseConv2D)                (None, 56, 56, 96)     864      ['block_1_pad[0][0]']
block_1_depthwise_BN (BatchNormalization)          (None, 56, 56, 96)     384      ['block_1_depthwise[0][0]']
block_1_depthwise_relu (ReLU)                      (None, 56, 56, 96)     0        ['block_1_depthwise_BN[0][0]']
block_1_project (Conv2D)                           (None, 56, 56, 24)     2304     ['block_1_depthwise_relu[0][0]']
block_1_project_BN (BatchNormalization)            (None, 56, 56, 24)     96       ['block_1_project[0][0]']
block_2_expand (Conv2D)                            (None, 56, 56, 144)    3456     ['block_1_project_BN[0][0]']
block_2_expand_BN (BatchNormalization)             (None, 56, 56, 144)    576      ['block_2_expand[0][0]']
block_2_expand_relu (ReLU)                         (None, 56, 56, 144)    0        ['block_2_expand_BN[0][0]']
block_2_depthwise (DepthwiseConv2D)                (None, 56, 56, 144)    1296     ['block_2_expand_relu[0][0]']
block_2_depthwise_BN (BatchNormalization)          (None, 56, 56, 144)    576      ['block_2_depthwise[0][0]']
block_2_depthwise_relu (ReLU)                      (None, 56, 56, 144)    0        ['block_2_depthwise_BN[0][0]']
block_2_project (Conv2D)                           (None, 56, 56, 24)     3456     ['block_2_depthwise_relu[0][0]']
block_2_project_BN (BatchNormalization)            (None, 56, 56, 24)     96       ['block_2_project[0][0]']
block_2_add (Add)                                  (None, 56, 56, 24)     0        ['block_1_project_BN[0][0]', 'block_2_project_BN[0][0]']
block_3_expand (Conv2D)                            (None, 56, 56, 144)    3456     ['block_2_add[0][0]']
block_3_expand_BN (BatchNormalization)             (None, 56, 56, 144)    576      ['block_3_expand[0][0]']
block_3_expand_relu (ReLU)                         (None, 56, 56, 144)    0        ['block_3_expand_BN[0][0]']
block_3_pad (ZeroPadding2D)                        (None, 57, 57, 144)    0        ['block_3_expand_relu[0][0]']
block_3_depthwise (DepthwiseConv2D)                (None, 28, 28, 144)    1296     ['block_3_pad[0][0]']
block_3_depthwise_BN (BatchNormalization)          (None, 28, 28, 144)    576      ['block_3_depthwise[0][0]']
block_3_depthwise_relu (ReLU)                      (None, 28, 28, 144)    0        ['block_3_depthwise_BN[0][0]']
block_3_project (Conv2D)                           (None, 28, 28, 32)     4608     ['block_3_depthwise_relu[0][0]']
block_3_project_BN (BatchNormalization)            (None, 28, 28, 32)     128      ['block_3_project[0][0]']
block_4_expand (Conv2D)                            (None, 28, 28, 192)    6144     ['block_3_project_BN[0][0]']
block_4_expand_BN (BatchNormalization)             (None, 28, 28, 192)    768      ['block_4_expand[0][0]']
block_4_expand_relu (ReLU)                         (None, 28, 28, 192)    0        ['block_4_expand_BN[0][0]']
block_4_depthwise (DepthwiseConv2D)                (None, 28, 28, 192)    1728     ['block_4_expand_relu[0][0]']
block_4_depthwise_BN (BatchNormalization)          (None, 28, 28, 192)    768      ['block_4_depthwise[0][0]']
block_4_depthwise_relu (ReLU)                      (None, 28, 28, 192)    0        ['block_4_depthwise_BN[0][0]']
block_4_project (Conv2D)                           (None, 28, 28, 32)     6144     ['block_4_depthwise_relu[0][0]']
block_4_project_BN (BatchNormalization)            (None, 28, 28, 32)     128      ['block_4_project[0][0]']
block_4_add (Add)                                  (None, 28, 28, 32)     0        ['block_3_project_BN[0][0]', 'block_4_project_BN[0][0]']
block_5_expand (Conv2D)                            (None, 28, 28, 192)    6144     ['block_4_add[0][0]']
block_5_expand_BN (BatchNormalization)             (None, 28, 28, 192)    768      ['block_5_expand[0][0]']
block_5_expand_relu (ReLU)                         (None, 28, 28, 192)    0        ['block_5_expand_BN[0][0]']
block_5_depthwise (DepthwiseConv2D)                (None, 28, 28, 192)    1728     ['block_5_expand_relu[0][0]']
block_5_depthwise_BN (BatchNormalization)          (None, 28, 28, 192)    768      ['block_5_depthwise[0][0]']
block_5_depthwise_relu (ReLU)                      (None, 28, 28, 192)    0        ['block_5_depthwise_BN[0][0]']
block_5_project (Conv2D)                           (None, 28, 28, 32)     6144     ['block_5_depthwise_relu[0][0]']
block_5_project_BN (BatchNormalization)            (None, 28, 28, 32)     128      ['block_5_project[0][0]']
block_5_add (Add)                                  (None, 28, 28, 32)     0        ['block_4_add[0][0]', 'block_5_project_BN[0][0]']
block_6_expand (Conv2D)                            (None, 28, 28, 192)    6144     ['block_5_add[0][0]']
block_6_expand_BN (BatchNormalization)             (None, 28, 28, 192)    768      ['block_6_expand[0][0]']
block_6_expand_relu (ReLU)                         (None, 28, 28, 192)    0        ['block_6_expand_BN[0][0]']
block_6_pad (ZeroPadding2D)                        (None, 29, 29, 192)    0        ['block_6_expand_relu[0][0]']
block_6_depthwise (DepthwiseConv2D)                (None, 14, 14, 192)    1728     ['block_6_pad[0][0]']
block_6_depthwise_BN (BatchNormalization)          (None, 14, 14, 192)    768      ['block_6_depthwise[0][0]']
block_6_depthwise_relu (ReLU)                      (None, 14, 14, 192)    0        ['block_6_depthwise_BN[0][0]']
block_6_project (Conv2D)                           (None, 14, 14, 64)     12288    ['block_6_depthwise_relu[0][0]']
block_6_project_BN (BatchNormalization)            (None, 14, 14, 64)     256      ['block_6_project[0][0]']
block_7_expand (Conv2D)                            (None, 14, 14, 384)    24576    ['block_6_project_BN[0][0]']
block_7_expand_BN (BatchNormalization)             (None, 14, 14, 384)    1536     ['block_7_expand[0][0]']
block_7_expand_relu (ReLU)                         (None, 14, 14, 384)    0        ['block_7_expand_BN[0][0]']
block_7_depthwise (DepthwiseConv2D)                (None, 14, 14, 384)    3456     ['block_7_expand_relu[0][0]']
block_7_depthwise_BN (BatchNormalization)          (None, 14, 14, 384)    1536     ['block_7_depthwise[0][0]']
block_7_depthwise_relu (ReLU)                      (None, 14, 14, 384)    0        ['block_7_depthwise_BN[0][0]']
block_7_project (Conv2D)                           (None, 14, 14, 64)     24576    ['block_7_depthwise_relu[0][0]']
block_7_project_BN (BatchNormalization)            (None, 14, 14, 64)     256      ['block_7_project[0][0]']
block_7_add (Add)                                  (None, 14, 14, 64)     0        ['block_6_project_BN[0][0]', 'block_7_project_BN[0][0]']
block_8_expand (Conv2D)                            (None, 14, 14, 384)    24576    ['block_7_add[0][0]']
block_8_expand_BN (BatchNormalization)             (None, 14, 14, 384)    1536     ['block_8_expand[0][0]']
block_8_expand_relu (ReLU)                         (None, 14, 14, 384)    0        ['block_8_expand_BN[0][0]']
block_8_depthwise (DepthwiseConv2D)                (None, 14, 14, 384)    3456     ['block_8_expand_relu[0][0]']
block_8_depthwise_BN (BatchNormalization)          (None, 14, 14, 384)    1536     ['block_8_depthwise[0][0]']
block_8_depthwise_relu (ReLU)                      (None, 14, 14, 384)    0        ['block_8_depthwise_BN[0][0]']
block_8_project (Conv2D)                           (None, 14, 14, 64)     24576    ['block_8_depthwise_relu[0][0]']
block_8_project_BN (BatchNormalization)            (None, 14, 14, 64)     256      ['block_8_project[0][0]']
block_8_add (Add)                                  (None, 14, 14, 64)     0        ['block_7_add[0][0]', 'block_8_project_BN[0][0]']
block_9_expand (Conv2D)                            (None, 14, 14, 384)    24576    ['block_8_add[0][0]']
block_9_expand_BN (BatchNormalization)             (None, 14, 14, 384)    1536     ['block_9_expand[0][0]']
block_9_expand_relu (ReLU)                         (None, 14, 14, 384)    0        ['block_9_expand_BN[0][0]']
block_9_depthwise (DepthwiseConv2D)                (None, 14, 14, 384)    3456     ['block_9_expand_relu[0][0]']
block_9_depthwise_BN (BatchNormalization)          (None, 14, 14, 384)    1536     ['block_9_depthwise[0][0]']
block_9_depthwise_relu (ReLU)                      (None, 14, 14, 384)    0        ['block_9_depthwise_BN[0][0]']
block_9_project (Conv2D)                           (None, 14, 14, 64)     24576    ['block_9_depthwise_relu[0][0]']
block_9_project_BN (BatchNormalization)            (None, 14, 14, 64)     256      ['block_9_project[0][0]']
block_9_add (Add)                                  (None, 14, 14, 64)     0        ['block_8_add[0][0]', 'block_9_project_BN[0][0]']
block_10_expand (Conv2D)                           (None, 14, 14, 384)    24576    ['block_9_add[0][0]']
block_10_expand_BN (BatchNormalization)            (None, 14, 14, 384)    1536     ['block_10_expand[0][0]']
block_10_expand_relu (ReLU)                        (None, 14, 14, 384)    0        ['block_10_expand_BN[0][0]']
block_10_depthwise (DepthwiseConv2D)               (None, 14, 14, 384)    3456     ['block_10_expand_relu[0][0]']
block_10_depthwise_BN (BatchNormalization)         (None, 14, 14, 384)    1536     ['block_10_depthwise[0][0]']
block_10_depthwise_relu (ReLU)                     (None, 14, 14, 384)    0        ['block_10_depthwise_BN[0][0]']
block_10_project (Conv2D)                          (None, 14, 14, 96)     36864    ['block_10_depthwise_relu[0][0]']
block_10_project_BN (BatchNormalization)           (None, 14, 14, 96)     384      ['block_10_project[0][0]']
block_11_expand (Conv2D)                           (None, 14, 14, 576)    55296    ['block_10_project_BN[0][0]']
block_11_expand_BN (BatchNormalization)            (None, 14, 14, 576)    2304     ['block_11_expand[0][0]']
block_11_expand_relu (ReLU)                        (None, 14, 14, 576)    0        ['block_11_expand_BN[0][0]']
block_11_depthwise (DepthwiseConv2D)               (None, 14, 14, 576)    5184     ['block_11_expand_relu[0][0]']
block_11_depthwise_BN (BatchNormalization)         (None, 14, 14, 576)    2304     ['block_11_depthwise[0][0]']
block_11_depthwise_relu (ReLU)                     (None, 14, 14, 576)    0        ['block_11_depthwise_BN[0][0]']
block_11_project (Conv2D)                          (None, 14, 14, 96)     55296    ['block_11_depthwise_relu[0][0]']
block_11_project_BN (BatchNormalization)           (None, 14, 14, 96)     384      ['block_11_project[0][0]']
block_11_add (Add)                                 (None, 14, 14, 96)     0        ['block_10_project_BN[0][0]', 'block_11_project_BN[0][0]']
block_12_expand (Conv2D)                           (None, 14, 14, 576)    55296    ['block_11_add[0][0]']
block_12_expand_BN (BatchNormalization)            (None, 14, 14, 576)    2304     ['block_12_expand[0][0]']
block_12_expand_relu (ReLU)                        (None, 14, 14, 576)    0        ['block_12_expand_BN[0][0]']
block_12_depthwise (DepthwiseConv2D)               (None, 14, 14, 576)    5184     ['block_12_expand_relu[0][0]']
block_12_depthwise_BN (BatchNormalization)         (None, 14, 14, 576)    2304     ['block_12_depthwise[0][0]']
block_12_depthwise_relu (ReLU)                     (None, 14, 14, 576)    0        ['block_12_depthwise_BN[0][0]']
block_12_project (Conv2D)                          (None, 14, 14, 96)     55296    ['block_12_depthwise_relu[0][0]']
block_12_project_BN (BatchNormalization)           (None, 14, 14, 96)     384      ['block_12_project[0][0]']
block_12_add (Add)                                 (None, 14, 14, 96)     0        ['block_11_add[0][0]', 'block_12_project_BN[0][0]']
block_13_expand (Conv2D)                           (None, 14, 14, 576)    55296    ['block_12_add[0][0]']
block_13_expand_BN (BatchNormalization)            (None, 14, 14, 576)    2304     ['block_13_expand[0][0]']
block_13_expand_relu (ReLU)                        (None, 14, 14, 576)    0        ['block_13_expand_BN[0][0]']
block_13_pad (ZeroPadding2D)                       (None, 15, 15, 576)    0        ['block_13_expand_relu[0][0]']
block_13_depthwise (DepthwiseConv2D)               (None, 7, 7, 576)      5184     ['block_13_pad[0][0]']
block_13_depthwise_BN (BatchNormalization)         (None, 7, 7, 576)      2304     ['block_13_depthwise[0][0]']
block_13_depthwise_relu (ReLU)                     (None, 7, 7, 576)      0        ['block_13_depthwise_BN[0][0]']
block_13_project (Conv2D)                          (None, 7, 7, 160)      92160    ['block_13_depthwise_relu[0][0]']
block_13_project_BN (BatchNormalization)           (None, 7, 7, 160)      640      ['block_13_project[0][0]']
block_14_expand (Conv2D)                           (None, 7, 7, 960)      153600   ['block_13_project_BN[0][0]']
block_14_expand_BN (BatchNormalization)            (None, 7, 7, 960)      3840     ['block_14_expand[0][0]']
block_14_expand_relu (ReLU)                        (None, 7, 7, 960)      0        ['block_14_expand_BN[0][0]']
block_14_depthwise (DepthwiseConv2D)               (None, 7, 7, 960)      8640     ['block_14_expand_relu[0][0]']
block_14_depthwise_BN (BatchNormalization)         (None, 7, 7, 960)      3840     ['block_14_depthwise[0][0]']
block_14_depthwise_relu (ReLU)                     (None, 7, 7, 960)      0        ['block_14_depthwise_BN[0][0]']
block_14_project (Conv2D)                          (None, 7, 7, 160)      153600   ['block_14_depthwise_relu[0][0]']
block_14_project_BN (BatchNormalization)           (None, 7, 7, 160)      640      ['block_14_project[0][0]']
block_14_add (Add)                                 (None, 7, 7, 160)      0        ['block_13_project_BN[0][0]', 'block_14_project_BN[0][0]']
block_15_expand (Conv2D)                           (None, 7, 7, 960)      153600   ['block_14_add[0][0]']
block_15_expand_BN (BatchNormalization)            (None, 7, 7, 960)      3840     ['block_15_expand[0][0]']
block_15_expand_relu (ReLU)                        (None, 7, 7, 960)      0        ['block_15_expand_BN[0][0]']
block_15_depthwise (DepthwiseConv2D)               (None, 7, 7, 960)      8640     ['block_15_expand_relu[0][0]']
block_15_depthwise_BN (BatchNormalization)         (None, 7, 7, 960)      3840     ['block_15_depthwise[0][0]']
block_15_depthwise_relu (ReLU)                     (None, 7, 7, 960)      0        ['block_15_depthwise_BN[0][0]']
block_15_project (Conv2D)                          (None, 7, 7, 160)      153600   ['block_15_depthwise_relu[0][0]']
block_15_project_BN (BatchNormalization)           (None, 7, 7, 160)      640      ['block_15_project[0][0]']
block_15_add (Add)                                 (None, 7, 7, 160)      0        ['block_14_add[0][0]', 'block_15_project_BN[0][0]']
block_16_expand (Conv2D)                           (None, 7, 7, 960)      153600   ['block_15_add[0][0]']
block_16_expand_BN (BatchNormalization)            (None, 7, 7, 960)      3840     ['block_16_expand[0][0]']
block_16_expand_relu (ReLU)                        (None, 7, 7, 960)      0        ['block_16_expand_BN[0][0]']
block_16_depthwise (DepthwiseConv2D)               (None, 7, 7, 960)      8640     ['block_16_expand_relu[0][0]']
block_16_depthwise_BN (BatchNormalization)         (None, 7, 7, 960)      3840     ['block_16_depthwise[0][0]']
block_16_depthwise_relu (ReLU)                     (None, 7, 7, 960)      0        ['block_16_depthwise_BN[0][0]']
block_16_project (Conv2D)                          (None, 7, 7, 320)      307200   ['block_16_depthwise_relu[0][0]']
block_16_project_BN (BatchNormalization)           (None, 7, 7, 320)      1280     ['block_16_project[0][0]']
Conv_1 (Conv2D)                                    (None, 7, 7, 1280)     409600   ['block_16_project_BN[0][0]']
Conv_1_bn (BatchNormalization)                     (None, 7, 7, 1280)     5120     ['Conv_1[0][0]']
out_relu (ReLU)                                    (None, 7, 7, 1280)     0        ['Conv_1_bn[0][0]']
global_average_pooling2d (GlobalAveragePooling2D)  (None, 1280)           0        ['out_relu[0][0]']
predictions (Dense)                                (None, 1000)           1281000  ['global_average_pooling2d[0][0]']
==================================================================================================
```
Files
Tensorflow
circle
onecc config
**onecc config**
```
[onecc]
one-import-tf=True
one-optimize=True

[one-import-tf]
model_format=saved_model
input_path=.
output_path=mobilenetv2.circle
input_arrays=input
input_shapes=1,224,224,3
output_arrays=predictions
converter_version=v2

[one-optimize]
input_path=mobilenetv2.circle
output_path=mobilenetv2.opt.circle
```
Todo
Operator
- Conv2d
- Reshape
- Fullyconnected
Generate Dataset
Issues
Contribute this item together!