tensorflow / tfjs

A WebGL accelerated JavaScript library for training and deploying ML models.
https://js.tensorflow.org
Apache License 2.0

Common patterns that need to be fused #5458

Open qjia7 opened 3 years ago

qjia7 commented 3 years ago

TFJS already has some fused ops, such as fusedConv2d, fusedDepthwiseConv2d, and fusedMatMul, which can greatly improve performance. However, many other patterns are used frequently across models but are not fused. We'd like to use this issue to track such patterns and see whether they can be fused in TFJS for better performance.

We raise this issue because TFJS now includes a WebGPU backend, which is more powerful than WebGL. The TFJS WebGPU backend is based on compute shaders rather than fragment shaders, so a shader can read from any position and write to any position. That flexibility opens up the possibility of fusing arbitrary op combinations. Even if a new fused pattern is hard to implement in some backends, those backends can still break the fused op down into individual ops and execute them one by one.
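As a rough illustration of what fusion buys and how a backend could fall back, here is a minimal sketch in plain Python rather than TFJS. The names `fused_matmul` and `fused_matmul_decomposed` are hypothetical and are not TFJS APIs; the sketch only assumes the matmul + bias + relu pattern that fusedMatMul covers.

```python
from typing import List, Optional

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    # Plain matrix product; allocates a full output matrix.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add_bias(x: Matrix, bias: List[float]) -> Matrix:
    # Adds one bias value per output column; allocates another matrix.
    return [[v + bias[j] for j, v in enumerate(row)] for row in x]


def relu(x: Matrix) -> Matrix:
    # Elementwise max(v, 0); allocates a third matrix.
    return [[max(v, 0.0) for v in row] for row in x]


def fused_matmul_decomposed(a: Matrix, b: Matrix, bias: List[float]) -> Matrix:
    # Fallback path (hypothetical): run the fused pattern as three
    # individual ops, with an intermediate result between each step.
    return relu(add_bias(matmul(a, b), bias))


def fused_matmul(a: Matrix, b: Matrix,
                 bias: Optional[List[float]] = None,
                 use_relu: bool = False) -> Matrix:
    # Fused path (hypothetical): each output element is accumulated,
    # biased, and activated before it is written, so no intermediate
    # matrices are materialized.
    out = []
    for i in range(len(a)):
        row = []
        for j in range(len(b[0])):
            acc = 0.0
            for k in range(len(b)):
                acc += a[i][k] * b[k][j]
            if bias is not None:
                acc += bias[j]
            if use_relu:
                acc = max(acc, 0.0)
            row.append(acc)
        out.append(row)
    return out


a = [[1.0, -2.0], [3.0, 4.0]]
b = [[1.0, 0.0], [0.0, 1.0]]
bias = [0.5, -10.0]
fused = fused_matmul(a, b, bias, use_relu=True)
decomposed = fused_matmul_decomposed(a, b, bias)
```

Both paths give the same result; the fused one simply avoids the extra passes over memory, which is where the performance gain would be expected to come from on a GPU backend.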

qjia7 commented 3 years ago

@pyu10055 @lina128 @jinjingforever Please help provide some common patterns that you already know. Thanks.

pyu10055 commented 3 years ago
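
  1. Separable conv2d, which is part of the MobileNetV2 residual and linear bottleneck layer:

    from tensorflow.keras.layers import (Activation, Add, BatchNormalization,
                                         Conv2D, DepthwiseConv2D)

    def bottleneck_block(x, expand=64, squeeze=16):
        # 1x1 expansion conv + BN + relu6
        m = Conv2D(expand, (1,1))(x)
        m = BatchNormalization()(m)
        m = Activation('relu6')(m)
        # padding='same' keeps the spatial size so the residual Add lines up
        m = DepthwiseConv2D((3,3), padding='same')(m)
        m = BatchNormalization()(m)
        m = Activation('relu6')(m)
        # 1x1 linear projection; squeeze must equal the channel count of x
        m = Conv2D(squeeze, (1,1))(m)
        m = BatchNormalization()(m)
        return Add()([m, x])

  2. Fuse activation with pooling ops.
  3. Fuse the MobileNetV3 h-swish activation.
  4. Fuse min + max => clip_by_value.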
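
The last two patterns can be sketched in plain Python to show what a fused kernel would compute. This is only an illustration of the arithmetic, not TFJS code: the function names are hypothetical, and h-swish is taken as x * relu6(x + 3) / 6 as defined for MobileNetV3.

```python
from typing import List


def clip_unfused(xs: List[float], lo: float, hi: float) -> List[float]:
    # Two elementwise ops (max, then min) with an intermediate list between them.
    lower_bounded = [max(x, lo) for x in xs]
    return [min(x, hi) for x in lower_bounded]


def clip_by_value(xs: List[float], lo: float, hi: float) -> List[float]:
    # One elementwise pass, no intermediate list.
    return [min(max(x, lo), hi) for x in xs]


def h_swish_unfused(xs: List[float]) -> List[float]:
    # x * relu6(x + 3) / 6 written as four separate elementwise ops.
    shifted = [x + 3.0 for x in xs]
    gated = [min(max(s, 0.0), 6.0) for s in shifted]
    scaled = [g / 6.0 for g in gated]
    return [x * s for x, s in zip(xs, scaled)]


def h_swish(xs: List[float]) -> List[float]:
    # Same arithmetic in a single pass over the input.
    return [x * (min(max(x + 3.0, 0.0), 6.0) / 6.0) for x in xs]


xs = [-4.0, -1.0, 0.0, 2.0, 7.0]
clipped = clip_by_value(xs, 0.0, 6.0)
activated = h_swish(xs)
```

In both cases the unfused version reads and writes the whole tensor once per op, while the fused version touches each element once.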

gaikwadrahul8 commented 1 year ago

Hi, @qjia7

Apologies for the delayed response. We are revisiting our older feature requests to check whether they have been implemented. Please also see the comment above from @pyu10055. May I know whether you are still looking for this feature, or whether you are working on this feature request in TFJS?

If anyone would like to contribute to this feature, you're always welcome; please feel free to do so, and please refer to these links: Ref-1, Ref-2. Thank you!

qjia7 commented 1 year ago

Assign this to me. I will look at this issue some time this year.