webmachinelearning / webnn

🧠 Web Neural Network API
https://www.w3.org/TR/webnn/

Unsqueeze operator is missing (so add unsqueeze, or better yet, delete squeeze) #296

Closed: fdwr closed this 10 months ago

fdwr commented 2 years ago

The squeeze operator is present, but its counterpart unsqueeze is missing (see the nearest ONNX / PyTorch / TF equivalents). It ought to be included for parity with squeeze and because many popular models use it:

huningxin commented 2 years ago

Thanks for the details, @fdwr! The model list is especially helpful.

One question: given that WebNN only supports static tensor shapes and would be used as a framework backend, could frameworks map unsqueeze (and squeeze) to WebNN's reshape operation?

On the other hand, unsqueeze does not seem to be widely supported by native ML APIs such as DirectML and NNAPI. Did I miss anything?

fdwr commented 2 years ago

@huningxin Yep, squeeze/unsqueeze/flatten/reshape are all in the same category of generic reshaping operators that do not modify the actual data, only reinterpret the tensor dimensions, and they all support static-shape scenarios.
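To make the mapping concrete, here is a rough sketch (my own illustration, not spec text) of how a framework could lower an ONNX-style unsqueeze onto WebNN's reshape; the `unsqueeze` helper name and the explicit `inputShape` parameter are assumptions, since shapes are static and known to the caller:

```ts
// Hypothetical framework-side helper: lower unsqueeze onto reshape.
// `axes` gives the output positions where new size-1 dimensions are
// inserted (ONNX Unsqueeze semantics). No data is moved; reshape only
// reinterprets the dimensions.
function unsqueeze(
  builder: MLGraphBuilder,
  input: MLOperand,
  inputShape: number[],
  axes: number[]
): MLOperand {
  const outRank = inputShape.length + axes.length;
  // Normalize negative axes and sort ascending.
  const inserted = axes
    .map((a) => (a < 0 ? a + outRank : a))
    .sort((a, b) => a - b);
  const outShape: number[] = [];
  let src = 0;
  for (let dim = 0; dim < outRank; ++dim) {
    outShape.push(inserted.includes(dim) ? 1 : inputShape[src++]);
  }
  return builder.reshape(input, outShape);
}

// e.g. unsqueeze of a [3, 4] tensor with axes [0, 3] yields [1, 3, 4, 1].
```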

> On the other hand, unsqueeze does not seem to be widely supported by native ML APIs such as DirectML and NNAPI

There is no DML_OPERATOR for these in the API because there is no need to execute any operation on the GPU; they are just CPU-side tweaks of the dimensions (in DML_BUFFER_TENSOR_DESC::Sizes and DimensionCount), and the same tensor data can be shared between the input and output tensors to avoid any copies. I see NNAPI has an ANEURALNETWORKS_RESHAPE operator which WebNN could use to implement squeeze/unsqueeze directly (or possibly via a call like ANeuralNetworksMemoryDesc_setDimensions).

I suppose a bigger question for whether WebNN should support operators like squeeze/unsqueeze is how high- or low-level an API WebNN is, and how easy it should be to ingest/convert existing models. Either way, I see the two operators as siblings: if we include one, we ought to include the other, and if we exclude one, we ought to exclude the other. Thanks.

huningxin commented 2 years ago

Thanks @fdwr !

> I suppose a bigger question for whether WebNN should support operators like squeeze/unsqueeze is how high- or low-level an API WebNN is

The WebNN spec has a section of guidelines for new operations that says "prefer primitives over new high level operations but consider performance consequences".

Given your description of its functionality, unsqueeze does not seem to be a performance-sensitive operator.

> I see the two operators as siblings: if we include one, we ought to include the other, and if we exclude one, we ought to exclude the other.

+1.

We should probably exclude both squeeze and unsqueeze.

fdwr commented 1 year ago

Note that I've implemented unsqueeze locally, along with flattenTo2d, but all three of {squeeze, unsqueeze, flattenTo2d} are really just variants of reshape and are implemented in terms of it. So calling frameworks could definitely get away with only having reshape available:

https://github.com/fdwr/chromium-src-webnn-dml/blob/a88e205c51edafb501ce4f7081e52605d1d518aa/third_party/blink/renderer/modules/ml/webnn/ml_graph_builder.idl#L202-L205

https://github.com/fdwr/chromium-src-webnn-dml/blob/aaa569fab061ee401a0e8897efa4cebd691a049e/third_party/blink/renderer/modules/ml/webnn/ml_graph_builder.cc#L2378-L2405

fdwr commented 1 year ago

Current thinking (per https://github.com/webmachinelearning/webnn/issues/375#issuecomment-1674224992) is now to remove squeeze from WebNN (and not add unsqueeze or flattenTo2d), implementing them all in terms of reshape, since libraries can trivially express them with a few lines of code. Additionally, libraries could have other "reshape"-flavored ops we don't even know about. As a backend, WebNN doesn't gain anything performance-wise by including them, while they do add backend complexity and testing burden.
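For illustration, a few lines like these (a sketch; the helper name and explicit `inputShape` parameter are assumptions) are all a calling library would need to recover squeeze on top of reshape:

```ts
// Hypothetical library-side squeeze built purely on reshape.
// Drops size-1 dimensions; if `axes` is given, only those positions
// (which must have size 1) are dropped, as in ONNX Squeeze.
function squeeze(
  builder: MLGraphBuilder,
  input: MLOperand,
  inputShape: number[],
  axes?: number[]
): MLOperand {
  const rank = inputShape.length;
  const drop = new Set(
    (axes ?? inputShape.map((_, i) => i).filter((i) => inputShape[i] === 1))
      .map((a) => (a < 0 ? a + rank : a))
  );
  const outShape = inputShape.filter((_, i) => !drop.has(i));
  return builder.reshape(input, outShape);
}
```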

FYI @wacky6. "Remove squeeze sgtm :)"

inexorabletash commented 10 months ago

Can this be closed off now that #478 is merged?