webmachinelearning / webnn

🧠 Web Neural Network API
https://www.w3.org/TR/webnn/

Where is scatter and gather op? #467

Open muazhuda opened 1 year ago

muazhuda commented 1 year ago

Is there a reason why it is not in the spec? tensorflow.js supports gather even for the WebGPU backend.

anssiko commented 1 year ago

Thanks for your question.

We have published guidelines for proposing new ops: https://github.com/webmachinelearning/webnn/blob/main/CONTRIBUTING.md#proposing-and-adding-a-new-operation

If you can answer those questions in this issue, the WG will be able to look at your request sooner.

The gather op has already been proposed in #375, and the WG is actively discussing it.
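For context, gather selects slices of the input along an axis according to an indices tensor. A tiny plain-JavaScript sketch of axis-0 gather (illustrative only; this is not the WebNN API surface, and `gatherRows` is a hypothetical helper):

```javascript
// Gather along axis 0: pick rows of `data` in the order given by `indices`.
// Repeated indices are allowed, so rows may appear more than once.
function gatherRows(data, indices) {
  return indices.map(i => data[i].slice());
}

const table = [[10, 11], [20, 21], [30, 31]];
const picked = gatherRows(table, [2, 0, 2]);
// picked: [[30, 31], [10, 11], [30, 31]]
```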

inexorabletash commented 3 months ago

re: Scatter

Open questions for scatter:

huningxin commented 3 months ago

re: Scatter

+1 to supporting scatter, in particular scatterND:

MLOperand scatterNd(MLOperand input, MLOperand indices, MLOperand updates);

When prototyping Whisper decoder inference with MLBuffer, I found the scatterND operator useful for inserting KV values by position into the pre-allocated static KV cache (WebNN requires static shapes).

builder.scatterNd(past_key, position_ids, present_key);
builder.scatterNd(past_value, position_ids, present_value);

This avoids reading the KV cache tensor back to the CPU and improves performance. The initial prototype is available at: https://github.com/huningxin/onnxruntime-inference-examples/blob/whisper-mlbuffer/js/whisper-demo/whisper.js
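To illustrate the semantics behind the KV-cache use case above, here is a minimal plain-JavaScript emulation of update-mode scatterND for the simple case of 1-D indices into the first axis of a 2-D tensor (`scatterNdRows` is a hypothetical helper, not part of WebNN):

```javascript
// ScatterND (update mode), restricted to scattering whole rows of a
// 2-D tensor. `data` is [rows][cols], `indices` lists row positions,
// `updates` is [indices.length][cols]. Returns a new tensor; the
// input is left unmodified, matching WebNN's value semantics.
function scatterNdRows(data, indices, updates) {
  const out = data.map(row => row.slice()); // copy the input
  indices.forEach((rowIndex, i) => {
    out[rowIndex] = updates[i].slice();     // overwrite selected rows
  });
  return out;
}

// Example: insert new KV rows at positions 1 and 3 of a static cache.
const cache = [[0, 0], [0, 0], [0, 0], [0, 0]];
const result = scatterNdRows(cache, [1, 3], [[5, 6], [7, 8]]);
// result: [[0, 0], [5, 6], [0, 0], [7, 8]]
```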

Platforms' support:

inexorabletash commented 3 months ago
  • TFLite scatter_nd: Scatters sparse updates according to individual values at the specified indices. It only supports "add"/"sum" mode. Emulation is needed to support scattering into an input tensor.

Can't TF's scatter_nd_update be used for the last case (scattering into an input tensor)?

The impression I got from a quick survey was that "update" into an input tensor was more common across the APIs than scattering into a new tensor of a given shape, but it sounds like models need both?

i.e. do we need both of these:

MLOperand scatterNd(MLOperand input, MLOperand indices, MLOperand updates);
MLOperand scatterNd(sequence<unsigned long> shape, MLOperand indices, MLOperand updates);

... or can we get away with the former only?

huningxin commented 3 months ago

@inexorabletash

Can't TF's scatter_nd_update be used for the last case (scattering to input tensor)?

The impression I got from a quick survey was that "update" into an input tensor was more common in the APIs vs scattering into a new tensor given the shape, but it sounds like models need both?

I agree that "update" into an input tensor is more commonly used, including the static KV cache update use case and some other models, like SAM ViT Base.

TF's scatter_nd_update could be mapped directly, but I am not sure whether it is available in the TFLite backend.

... or can we get away with the former only?

+1 to the former only. The latter can be emulated by updating into a zero-initialized tensor.
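The emulation described above can be sketched in plain JavaScript: the shape-taking variant is just an update into a zero-initialized tensor. Shown here for the restricted case of scattering whole rows of a 2-D tensor (`scatterNdRows` and `scatterNdToShape` are hypothetical helpers, not WebNN API):

```javascript
// Update-style scatterND restricted to whole rows of a 2-D tensor:
// copies `data`, then overwrites the rows named by `indices` with `updates`.
function scatterNdRows(data, indices, updates) {
  const out = data.map(row => row.slice());
  indices.forEach((rowIndex, i) => { out[rowIndex] = updates[i].slice(); });
  return out;
}

// The "shape" variant emulated via the update variant: build a
// zero-initialized tensor of the requested shape and update into it.
function scatterNdToShape([rows, cols], indices, updates) {
  const zeros = Array.from({ length: rows }, () => new Array(cols).fill(0));
  return scatterNdRows(zeros, indices, updates);
}

const out = scatterNdToShape([3, 2], [0, 2], [[1, 1], [2, 2]]);
// out: [[1, 1], [0, 0], [2, 2]]
```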

inexorabletash commented 3 months ago

TF's scatter_nd_update could be mapped directly, but I am not sure whether it is available in the TFLite backend.

Ooof, yeah. Not listed in https://www.tensorflow.org/mlir/tfl_ops

huningxin commented 2 months ago

TF's scatter_nd_update could be mapped directly, but I am not sure whether it is available in the TFLite backend.

Ooof, yeah. Not listed in https://www.tensorflow.org/mlir/tfl_ops

Although scatter_nd_update is not supported as a TFLite built-in op, it might be supported via the Select TensorFlow operators feature; it is listed among the supported Select TensorFlow operators.

/cc @reillyeon

reillyeon commented 2 months ago

We'll have to see the binary size impact of adding support for these operators to Chromium's built-in copy of TFLite, and also what level of support delegates have for them.