david8862 / keras-YOLOv3-model-set

end-to-end YOLOv4/v3/v2 object detection pipeline, implemented on tf.keras with different technologies
MIT License

lambda layers and portability to tfjs #19

Open gillmac13 opened 4 years ago

gillmac13 commented 4 years ago

Hi @david8862 ,

Part of my experiments involves porting the models to tfjs, since my final application is in JavaScript. I have found that the Python Lambda layers cannot be converted by tensorflowjs_converter, because they cannot be interpreted in JavaScript. This is a concern for the two ShuffleNet backbones and the YOLO Nano structure. In the latter case, I have tried to replace the Lambda layer in the FCA block (yolo3_nano.py) with pure Keras layers. This is my tentative FCA block, same as yours, but with an ugly hack:

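from tensorflow.keras.layers import (Dense, GlobalAveragePooling2D, Multiply,
                                     Reshape, UpSampling2D)

def _fca_block(inputs, reduct_ratio, block_id):
    in_channels = inputs.shape.as_list()[-1]
    # Static (height, width) of the feature map; both are None for dynamic input sizes.
    in_shapes = inputs.shape.as_list()[1:3]
    reduct_channels = int(in_channels // reduct_ratio)
    prefix = 'fca_block_{}_'.format(block_id)

    # Squeeze: one value per channel, then two dense layers produce the attention factor.
    x = GlobalAveragePooling2D(name=prefix + 'average_pooling')(inputs)
    x = Dense(reduct_channels, activation='relu', name=prefix + 'fc1')(x)
    x = Dense(in_channels, activation='sigmoid', name=prefix + 'fc2')(x)

    # Reshape (batch, channels) to (batch, 1, 1, channels).
    # The name carries the block prefix so it stays unique if the block is reused.
    x = Reshape((1, 1, in_channels), name=prefix + 'reshape')(x)
    # The hack: when the spatial size is known, tile the factor up to (height, width).
    if in_shapes != [None, None]:
        x = UpSampling2D(in_shapes, name=prefix + 'upsample')(x)

    # Rescale every channel of the input by its attention factor.
    x = Multiply(name=prefix + 'multiply')([x, inputs])

    return x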

It works well: the models are good, tensorflowjs_converter does the job, and I can load the models in JavaScript. But I'm not sure whether I have respected the purpose of the FCA block. Would you have an idea?

Gilles
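
One way to check that a rewrite keeps the block's purpose is to write the computation out by hand. The NumPy sketch below mirrors the steps of the code above: global average pooling, two dense layers, then a per-channel rescaling of the input. The function name `fca_reference`, the random stand-in weights and the omission of biases are assumptions made for illustration only; they are not taken from the repository. On a 1x1 map, UpSampling2D with its default nearest-neighbour interpolation only copies the attention value to every spatial position, which is modelled here with `np.tile`.

```python
import numpy as np


def fca_reference(inputs, w1, w2):
    """Hand-written FCA computation on a channels-last batch (illustrative only)."""
    channels = inputs.shape[-1]
    squeezed = inputs.mean(axis=(1, 2))               # global average pool -> (N, C)
    hidden = np.maximum(squeezed @ w1, 0.0)           # dense + relu -> (N, C // ratio)
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))       # dense + sigmoid -> (N, C)
    gate = gate.reshape(-1, 1, 1, channels)           # (N, 1, 1, C)
    height, width = inputs.shape[1:3]
    tiled = np.tile(gate, (1, height, width, 1))      # nearest upsampling of a 1x1 map
    return tiled * inputs, gate


rng = np.random.default_rng(0)
features = rng.random((2, 13, 13, 16))                # hypothetical feature map
w1 = rng.standard_normal((16, 4))                     # stand-in weights, reduct_ratio = 4
w2 = rng.standard_normal((4, 16))

scaled, gate = fca_reference(features, w1, w2)
print(scaled.shape)                                   # (2, 13, 13, 16)
```

Every channel of the output is the corresponding input channel multiplied by a single factor between 0 and 1, which is the channel-attention behaviour the block is meant to provide.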

david8862 commented 4 years ago

Hi @gillmac13. It's a good solution and should also match the YOLO Nano design. I just noticed that the UpSampling2D is unnecessary, since the attention factor is automatically broadcast across the feature map by Multiply. I can merge this implementation into the code base. Many thanks for your suggestion~
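
The simplification suggested here can be sketched as follows. This is a minimal illustration, not necessarily the code that was merged: the function name `fca_block_broadcast` and the test input are hypothetical, and it assumes a TensorFlow 2.x installation with channels-last tensors. The `(batch, 1, 1, channels)` attention factor goes straight into Multiply, which broadcasts it over height and width, so the block also works when the spatial size is unknown.

```python
import numpy as np
from tensorflow.keras.layers import (Dense, GlobalAveragePooling2D, Input,
                                     Multiply, Reshape)
from tensorflow.keras.models import Model


def fca_block_broadcast(inputs, reduct_ratio, block_id):
    """FCA block built only from standard Keras layers (illustrative sketch)."""
    in_channels = int(inputs.shape[-1])
    reduct_channels = in_channels // reduct_ratio
    prefix = 'fca_block_{}_'.format(block_id)

    x = GlobalAveragePooling2D(name=prefix + 'average_pooling')(inputs)
    x = Dense(reduct_channels, activation='relu', name=prefix + 'fc1')(x)
    x = Dense(in_channels, activation='sigmoid', name=prefix + 'fc2')(x)
    # (batch, channels) -> (batch, 1, 1, channels); no UpSampling2D afterwards.
    x = Reshape((1, 1, in_channels), name=prefix + 'reshape')(x)
    # Multiply broadcasts the 1x1 attention factor over the spatial dimensions.
    return Multiply(name=prefix + 'multiply')([x, inputs])


# Dynamic spatial size, 16 channels (arbitrary values for the demonstration).
feature_map = Input(shape=(None, None, 16))
model = Model(feature_map, fca_block_broadcast(feature_map, reduct_ratio=4, block_id=0))

sample = (np.random.rand(2, 13, 13, 16) + 0.5).astype('float32')
out = model.predict(sample, verbose=0)
print(out.shape)  # (2, 13, 13, 16)
```

Because the model contains no Lambda layer, it stays within the set of layers that motivated the original rewrite; whether a given tfjs version converts it cleanly should still be checked with tensorflowjs_converter.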