muazhuda opened this issue 1 year ago
Thanks for your question.
We have published guidelines for proposing new ops: https://github.com/webmachinelearning/webnn/blob/main/CONTRIBUTING.md#proposing-and-adding-a-new-operation
If you can answer those questions in this issue, the WG will be able to look at your request sooner.
The gather op has already been proposed in #375 and the WG is actively discussing it.
re: Scatter
Open questions for scatter:
> re: Scatter
+1 to support scatter, in particular scatterND:

```webidl
MLOperand scatterNd(MLOperand input, MLOperand indices, MLOperand updates);
```
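To make the proposed "update" semantics concrete (as in ONNX ScatterND), here is a minimal reference sketch; the flat-array tensor model and all names are illustrative, not part of the proposal:

```js
// Reference sketch of "update"-mode scatterND on row-major flat arrays.
// Tensors are modeled as { data, shape }. Each tuple in `indices` addresses
// one slice of `input`, which is overwritten by the matching slice of
// `updates` in the output.
function scatterNdUpdate(input, indices, updates) {
  const out = input.data.slice();                 // output starts as a copy
  const k = indices[0].length;                    // depth of each index tuple
  // elements per updated slice = product of the trailing input dims
  const sliceSize = input.shape.slice(k).reduce((a, b) => a * b, 1);
  // element strides of the first k dims
  const strides = new Array(k);
  strides[k - 1] = sliceSize;
  for (let d = k - 2; d >= 0; d--) strides[d] = strides[d + 1] * input.shape[d + 1];
  indices.forEach((tuple, i) => {
    const base = tuple.reduce((off, idx, d) => off + idx * strides[d], 0);
    for (let j = 0; j < sliceSize; j++) out[base + j] = updates[i * sliceSize + j];
  });
  return { data: out, shape: input.shape };
}

// e.g. overwrite rows 0 and 2 of a 4x2 tensor:
// scatterNdUpdate({ data: [1,1, 2,2, 3,3, 4,4], shape: [4, 2] },
//                 [[0], [2]], [9,9, 7,7]).data  ->  [9,9, 2,2, 7,7, 4,4]
```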
When prototyping the Whisper decoder's inference with MLBuffer, I found the scatterND operator useful for inserting KV values by position into the pre-allocated static KV cache (WebNN requires static shapes).
```js
// Scatter the newly computed K/V values into the pre-allocated static cache
// at the positions given by position_ids:
builder.scatterNd(past_key, position_ids, present_key);
builder.scatterNd(past_value, position_ids, present_value);
```
This avoids reading the KV cache tensor back to the CPU and improves performance. The initial prototype is available at: https://github.com/huningxin/onnxruntime-inference-examples/blob/whisper-mlbuffer/js/whisper-demo/whisper.js
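For concreteness, the indices for this use case could be built along these lines (a hypothetical sketch; the cache layout and names are assumptions, not taken from the prototype):

```js
// Hypothetical: for a static cache of shape [numHeads, maxSeqLen, headDim],
// updating decode position t across all heads means one [h, t] tuple per
// head, each selecting a [headDim] slice to overwrite.
function kvCacheIndices(numHeads, t) {
  const tuples = [];
  for (let h = 0; h < numHeads; h++) tuples.push([h, t]);
  return tuples;
}
// kvCacheIndices(2, 5)  ->  [[0, 5], [1, 5]]
```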
Platforms' support:
- CoreML `scatter_nd`: Scatter `updates` to `data` at locations `indices`. Supports "update" and other modes.
- TF `tf.scatter_nd`: Scatter sparse `updates` according to individual values at the specified `indices`. Only supports "add"/"sum" mode. Needs emulation to support scattering into an input tensor.
- TFLite `scatter_nd`: Scatter sparse `updates` according to individual values at the specified `indices`. Only supports "add"/"sum" mode. Needs emulation to support scattering into an input tensor (see the sketch after this list).
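The emulation the last two rows call for could, for instance, derive "update" behavior from the add-only scatter-into-zeros primitive: scatter ones to build a mask of the updated locations, zero those locations in the input, then add the scattered updates. A minimal sketch on flat arrays (the helpers are illustrative, not any backend's real API, and index tuples are assumed unique):

```js
// Add-mode scatter into a zero tensor of the given shape (the behavior
// TF/TFLite scatter_nd provide); indices are full-rank tuples, one scalar
// update per tuple.
function scatterNdAddToZeros(shape, indices, updates) {
  const out = new Array(shape.reduce((a, b) => a * b, 1)).fill(0);
  indices.forEach((tuple, i) => {
    // row-major flat offset of a full-rank index tuple
    const flat = tuple.reduce((off, idx, d) => off * shape[d] + idx, 0);
    out[flat] += updates[i];
  });
  return out;
}

// Emulated "update"-mode scatter; assumes every index tuple is unique.
function emulatedScatterNdUpdate(input, shape, indices, updates) {
  const mask = scatterNdAddToZeros(shape, indices, updates.map(() => 1));
  const scattered = scatterNdAddToZeros(shape, indices, updates);
  // keep input where mask is 0, take the scattered update where mask is 1
  return input.map((v, i) => v * (1 - mask[i]) + scattered[i]);
}

// emulatedScatterNdUpdate([1, 2, 3, 4], [4], [[0], [2]], [9, 7])  ->  [9, 2, 7, 4]
```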
Can't TF's scatter_nd_update be used for the last case (scattering to input tensor)?
The impression I got from a quick survey was that "update" into an input tensor was more common in the APIs vs scattering into a new tensor given the shape, but it sounds like models need both?
i.e. do we need both of these:
```webidl
MLOperand scatterNd(MLOperand input, MLOperand indices, MLOperand updates);
MLOperand scatterNd(sequence<unsigned long> shape, MLOperand indices, MLOperand updates);
```
... or can we get away with the former only?
@inexorabletash
> Can't TF's scatter_nd_update be used for the last case (scattering to input tensor)?
>
> The impression I got from a quick survey was that "update" into an input tensor was more common in the APIs vs scattering into a new tensor given the shape, but it sounds like models need both?
I agree that "update" into an input tensor is more commonly used, including by the static KV cache update use case and some other models, like SAM ViT Base.
TF's scatter_nd_update could be mapped directly, but I am not sure whether it is available when TFLite is the backend.
> ... or can we get away with the former only?
+1 to the former only. The latter can be emulated by updating into a zero-initialized tensor, as sketched below.
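For instance, assuming the builder's existing constant() method (the descriptor form used here is an assumption), the shape-taking overload could reduce to one extra constant over the input-taking one:

```js
// Hypothetical emulation of scatterNd(shape, indices, updates):
// scatter into a zero-initialized constant of the requested shape.
function scatterNdToShape(builder, shape, indices, updates) {
  const size = shape.reduce((a, b) => a * b, 1);
  const zeros = builder.constant(
      { dataType: 'float32', dimensions: shape },  // assumed MLOperandDescriptor form
      new Float32Array(size));                     // Float32Array zero-fills
  return builder.scatterNd(zeros, indices, updates);
}
```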
> TF's scatter_nd_update could be mapped directly, but I am not sure whether it is available when TFLite is the backend.
Ooof, yeah. Not listed in https://www.tensorflow.org/mlir/tfl_ops
Although `scatter_nd_update` is not supported as a TFLite built-in op, it might be supported through the Select TensorFlow operators feature; it is listed among the Supported Select TensorFlow operators.
/cc @reillyeon
We'll have to see what the binary size impact of adding support for these operators to Chromium's built-in copy of TFLite is, and also what level of support delegates have for them.
Is there a reason why it is not in the spec? TensorFlow.js supports gather even for the WebGPU backend.