tensorflow / tfjs

A WebGL accelerated JavaScript library for training and deploying ML models.
https://js.tensorflow.org
Apache License 2.0

MoveNet Optimizations? #6396

Closed Fluxionsdx closed 2 years ago

Fluxionsdx commented 2 years ago

I have the official TFJS MoveNet (modelType=SINGLEPOSE_THUNDER) and a custom TFJS model running simultaneously in the browser (Chrome) on a 2021 MacBook M1. When running only one model at a time, both models run at around 50 frames per second. However, when I run them together, MoveNet drops only slightly, to 43 fps, while the custom model drops to 15 fps. I would expect both models to suffer a similar performance drop, so I'm wondering whether some optimization or clever design technique went into MoveNet that I'm not aware of.

The custom model is defined in Python, converted to TFJS, and loaded into the Angular application with the loadGraphModel() function. It's a U-Net-style architecture with a Keras MobileNet as the downsampling path and simple UpSampling2D plus depthwise separable convolutions as the upsampling path. I have pasted the model definition code below.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, activations


def mobile_uNet(num_classes, mn_size, pretrained, input_shape, layer_specs):
    # ImageNet weights are only published for these MobileNet width multipliers.
    if pretrained and mn_size in [0.25, 0.5, 0.75, 1.0]:
        weights = 'imagenet'
    else:
        weights = None
    encoder = tf.keras.applications.MobileNet(
        input_shape=input_shape + (3,), alpha=mn_size,
        weights=weights, include_top=False)

    for idx, layer in enumerate(layer_specs):
        if idx == 0:
            up_input = encoder.get_layer('conv_pw_13_relu').output
        else:
            up_input = up_relu

        # Skip connection from the encoder, then upsample the decoder input.
        up_res = encoder.get_layer('conv_pw_{}_relu'.format(layer[1])).output
        up_upsample = layers.UpSampling2D(2)(up_input)

        # Handle mismatched rows - pad top.
        if up_upsample.shape[1] < up_res.shape[1]:
            up_upsample = layers.ZeroPadding2D(((1, 0), (0, 0)))(up_upsample)
        elif up_upsample.shape[1] > up_res.shape[1]:
            up_res = layers.ZeroPadding2D(((1, 0), (0, 0)))(up_res)

        # Handle mismatched columns - pad left.
        if up_upsample.shape[2] < up_res.shape[2]:
            up_upsample = layers.ZeroPadding2D(((0, 0), (1, 0)))(up_upsample)
        elif up_upsample.shape[2] > up_res.shape[2]:
            up_res = layers.ZeroPadding2D(((0, 0), (1, 0)))(up_res)

        up_concat = layers.Concatenate()([up_upsample, up_res])
        # Filter counts must be integers, so round after scaling by the width multiplier.
        up_conv = layers.SeparableConv2D(
            int(layer[0] * mn_size), 3, padding='same')(up_concat)
        up_bn = layers.BatchNormalization()(up_conv)
        up_relu = layers.Activation(activations.relu)(up_bn)

    final_upsample = layers.UpSampling2D(2)(up_relu)
    final_conv = layers.SeparableConv2D(num_classes, 3, padding='same')(final_upsample)
    final_bn = layers.BatchNormalization()(final_conv)
    output = layers.Activation(activations.sigmoid)(final_bn)

    model = keras.Model(encoder.input, output)
    return model
```
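
For reference, here is roughly how the two models are loaded and timed per frame in the browser, matching the fps measurements above. This is a minimal sketch, not the actual application code: the model URL, the 224x224 input size, the /255 normalization, and the video element are placeholder assumptions.

```typescript
// Minimal sketch: load MoveNet (via the pose-detection package) and the
// converted U-Net side by side and time each per frame. The model URL,
// 224x224 input size, /255 normalization, and #video element are
// placeholder assumptions, not taken from the real app.
import * as tf from '@tensorflow/tfjs';
import * as poseDetection from '@tensorflow-models/pose-detection';

async function run() {
  const detector = await poseDetection.createDetector(
    poseDetection.SupportedModels.MoveNet,
    { modelType: poseDetection.movenet.modelType.SINGLEPOSE_THUNDER });
  const unet = await tf.loadGraphModel('/assets/unet/model.json');
  const video = document.getElementById('video') as HTMLVideoElement;

  async function frame() {
    // Time MoveNet on its own.
    let t0 = performance.now();
    await detector.estimatePoses(video);
    const movenetMs = performance.now() - t0;

    // Time the custom U-Net. Awaiting data() forces the queued WebGL work
    // to finish before the timer stops.
    t0 = performance.now();
    const input = tf.tidy(() =>
      tf.image.resizeBilinear(tf.browser.fromPixels(video), [224, 224])
        .div(255)
        .expandDims(0));
    const mask = unet.predict(input) as tf.Tensor;
    await mask.data();
    const unetMs = performance.now() - t0;
    input.dispose();
    mask.dispose();

    console.log(`MoveNet ${movenetMs.toFixed(1)} ms, U-Net ${unetMs.toFixed(1)} ms`);
    requestAnimationFrame(frame);
  }
  requestAnimationFrame(frame);
}
run();
```

Awaiting the output download before stopping the timer matters here: WebGL kernels are queued asynchronously, so without it the per-model timings would mostly measure kernel dispatch rather than actual GPU work.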
ahmedsabie commented 2 years ago

@Fluxionsdx Here is a blog post that describes the architecture. If you need more help, @ronnyvotel and @lina128 wrote the blog post.

Fluxionsdx commented 2 years ago

There were mistakes in my code. Please delete this issue. Sorry for wasting your time.

rthadur commented 2 years ago

Thank you