intel / webml-polyfill

Deprecated, the Web Neural Network Polyfill project has been moved to https://github.com/webmachinelearning/webnn-polyfill
Apache License 2.0

[PoseNet]WebML do not support Dilated Convolution #79

Closed yl495 closed 5 years ago

yl495 commented 6 years ago

Detail:
PoseNet uses a MobileNet model to predict human pose. Certain depthwise layers in that model require dilation. The code that sets up dilated convolution in PoseNet is shown below:

function toOutputStridedLayers(
    convolutionDefinition: ConvolutionDefinition[],
    outputStride: OutputStride): Layer[] {
  // The currentStride variable keeps track of the output stride of
  // the activations, i.e., the running product of convolution
  // strides up to the current network layer. This allows us to
  // invoke atrous convolution whenever applying the next
  // convolution would result in the activations having output
  // stride larger than the target outputStride.
  let currentStride = 1;
  // The atrous convolution rate parameter.
  let rate = 1;
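
The excerpt above stops before the loop that uses `currentStride` and `rate`. As a rough illustration of what its comments describe, here is a minimal sketch; the function name `toOutputStridedLayersSketch`, the type shapes, and the loop body are assumptions for illustration and are not quoted from PoseNet.

```typescript
// Hypothetical stand-ins for PoseNet's types (assumed shapes).
type ConvType = "conv2d" | "separableConv";
type ConvolutionDefinition = [ConvType, number]; // [layer type, stride]

interface Layer {
  blockId: number;
  convType: ConvType;
  stride: number;       // stride actually applied by this layer
  rate: number;         // atrous (dilation) rate applied by this layer
  outputStride: number; // running output stride after this layer
}

function toOutputStridedLayersSketch(
    convolutionDefinition: ConvolutionDefinition[],
    outputStride: number): Layer[] {
  let currentStride = 1; // running product of the strides applied so far
  let rate = 1;          // atrous rate to use once the target is reached

  return convolutionDefinition.map(([convType, stride], blockId): Layer => {
    let layerStride: number;
    let layerRate: number;
    if (currentStride === outputStride) {
      // Target reached: stop downsampling and fold this layer's stride
      // into the dilation rate used by the layers that follow.
      layerStride = 1;
      layerRate = rate;
      rate *= stride;
    } else {
      // Target not reached yet: apply the stride as an ordinary convolution.
      layerStride = stride;
      layerRate = 1;
      currentStride *= stride;
    }
    return {
      blockId,
      convType,
      stride: layerStride,
      rate: layerRate,
      outputStride: currentStride,
    };
  });
}

// Example: a made-up five-layer definition with a target output stride of 4.
const exampleDefinition: ConvolutionDefinition[] = [
  ["conv2d", 2],
  ["separableConv", 1],
  ["separableConv", 2],
  ["separableConv", 2],
  ["separableConv", 1],
];
const exampleLayers = toOutputStridedLayersSketch(exampleDefinition, 4);
console.log(exampleLayers.map((l) => [l.stride, l.rate]));
```

In this made-up example the applied strides come out as 2, 1, 2, 1, 1 and the rates as 1, 1, 1, 1, 2, so only the last layer needs a dilation rate greater than 1. Those are the layers an API without a dilation parameter cannot express directly.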

The WebML API currently has no parameter that accepts a dilation rate.

Temporary solution
In the WebGL backend, our workaround is to reuse the multiplier parameter of depthwise convolution to carry the dilation rate, and then apply the multiplier to perform the dilated convolution. The code is shown below:

if (Array.isArray(dilation_rate)) {
  if (depthMultiplier !== 1) {
    this.dilationRate = dilation_rate.map(i => i * depthMultiplier);
  } else {
    this.dilationRate = dilation_rate;
  }
} else {
  this.dilationRate = [dilation_rate, dilation_rate];
}

huningxin commented 6 years ago

Thanks for reporting this issue.

Could you please help investigate dilated convolution support in the Android NN API, macOS MPS, and the BNNS API?

yl495 commented 6 years ago

Yes, I read the layer documentation for the Android NN API, macOS MPS, and the BNNS API and found the following:

NN API: does not support dilated convolution https://developer.android.com/ndk/reference/group/neural-networks#group___neural_networks_1ggaabbe492c60331b13038e39d4207940e0a34a73b5eaf458b67db5eda71557d1d01
macOS MPS: supports dilated convolution https://developer.apple.com/documentation/metalperformanceshaders/mpscnnconvolutiondescriptor
BNNS API: does not support dilated convolution https://developer.apple.com/documentation/accelerate/bnnsconvolutionlayerparameters?changes=_1

However, PoseNet uses dilation in depthwise convolution, and none of the three APIs supports dilation in depthwise convolution. I am going to examine whether we could use convolution and pooling to replace the depthwise convolution.

huningxin commented 6 years ago

Thanks for the investigation. Your plan sounds good. Please investigate and give us an update.

yl495 commented 5 years ago

I solved dilated convolution by zero-interpolating the weights data, but this slows the run time a little; it is about 100 ms.
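
The thread does not include the code for this fix. As a rough illustration of zero interpolation, the sketch below expands a single-channel 2-D kernel by inserting `rate - 1` zeros between adjacent weights, so that an ordinary (undilated) convolution with the expanded kernel gives the same result as a dilated convolution with the original one. The function name `dilateKernel` and the nested-array layout are assumptions made for this sketch; the polyfill's real weight buffers are flat and multi-channel.

```typescript
// Expand a 2-D kernel by zero interpolation: place the original weights
// `rate` positions apart and fill the gaps with zeros.
// Assumes a rectangular, single-channel kernel (illustrative layout only).
function dilateKernel(kernel: number[][], rate: number): number[][] {
  if (!Number.isInteger(rate) || rate < 1) {
    throw new Error("rate must be a positive integer");
  }
  const kernelHeight = kernel.length;
  const kernelWidth = kernelHeight > 0 ? kernel[0].length : 0;
  if (kernelHeight === 0 || kernelWidth === 0) {
    return [];
  }

  // Effective size of a dilated kernel: (k - 1) * rate + 1.
  const dilatedHeight = (kernelHeight - 1) * rate + 1;
  const dilatedWidth = (kernelWidth - 1) * rate + 1;

  const dilated: number[][] = Array.from(
      {length: dilatedHeight}, () => new Array<number>(dilatedWidth).fill(0));

  for (let y = 0; y < kernelHeight; y++) {
    for (let x = 0; x < kernelWidth; x++) {
      dilated[y * rate][x * rate] = kernel[y][x];
    }
  }
  return dilated;
}

// A 2x2 kernel with rate 2 becomes a 3x3 kernel with zeros in the gaps.
console.log(dilateKernel([[1, 2], [3, 4]], 2));
// → [[1, 0, 2], [0, 0, 0], [3, 0, 4]]
```

The expanded kernel has more taps than the original (a 3x3 kernel at rate 2 becomes 5x5), and the extra multiplications by zero are one plausible reason for the added run time mentioned above.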