Closed fdwr closed 10 months ago
Thanks for the details, especially the model list, which is so helpful, @fdwr!
One question: given that WebNN only supports static tensor shapes and would be used as a framework backend, could the frameworks map `unsqueeze` (and `squeeze`) to WebNN's `reshape` operation?
On the other hand, `unsqueeze` does not seem to be widely supported by native ML APIs, such as DirectML and NNAPI. Did I miss anything?
@huningxin Yep, squeeze/unsqueeze/flatten/reshape are all in the same category of generic reshaping operators that do not modify the actual data, just reinterpret the tensor dimensions, and they all support static shape scenarios.
> On the other hand, unsqueeze seems not to be widely supported by native ML APIs, such as DirectML and NNAPI
There is no `DML_OPERATOR` for these in the API because there is no need to actually execute any operation on the GPU; they are just CPU-side tweaks of the dimensions (in `DML_BUFFER_TENSOR_DESC::Sizes` and `DimensionCount`), and the same tensor data can be shared between the input and output tensors to avoid any copies. I see NNAPI has an `ANEURALNETWORKS_RESHAPE` operator which WebNN could use to directly implement squeeze/unsqueeze (or possibly via a call like `ANeuralNetworksMemoryDesc_setDimensions`). I suppose a bigger question for whether WebNN should support operators like squeeze/unsqueeze is how high- or low-level an API WebNN is, and how easy it should be to ingest/convert existing models. Either way, I see the two operators as siblings: if we include one, we ought to include the other, and if we exclude one, we exclude the other. Thanks.
Thanks @fdwr !
> I suppose a bigger question for whether WebNN should support operators like squeeze/unsqueeze is how high/low level of an API WebNN is
There is a section of guidelines for new operations in the WebNN spec that says "prefer primitives over new high level operations but consider performance consequences".
Given your description of its functionality, `unsqueeze` does not seem to be a performance-sensitive operator.
> I see the two operators as siblings: if we include one, we ought to include the other, and if we exclude one, we exclude the other.
+1. Probably we should exclude both `squeeze` and `unsqueeze`.
Note I've implemented `unsqueeze` locally, along with `flattenTo2d`, but all three {`squeeze`, `unsqueeze`, `flattenTo2d`} are really just variants of `reshape` and are actually implemented in terms of it. So, calling frameworks could definitely get away with only having `reshape` available:
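As a concrete sketch of that mapping (the helper names below are hypothetical, not part of WebNN or any framework's API), a framework with only `reshape` available could compute the new dimensions on the CPU and emit a single reshape, since none of these ops touch the tensor data:

```javascript
// Hypothetical helpers: compute the target dimensions so that squeeze and
// unsqueeze both lower to a single reshape over the same data.

function squeezeShape(shape, axes = null) {
  // Drop size-1 dimensions; if `axes` is given, only drop those positions.
  return shape.filter((d, i) => !(d === 1 && (axes === null || axes.includes(i))));
}

function unsqueezeShape(shape, axes) {
  // Insert a size-1 dimension at each axis (axes are relative to the
  // output rank, as in ONNX Unsqueeze).
  const outRank = shape.length + axes.length;
  const out = [];
  let j = 0;
  for (let i = 0; i < outRank; ++i) {
    out.push(axes.includes(i) ? 1 : shape[j++]);
  }
  return out;
}
```

A framework would then emit something like `builder.reshape(x, unsqueezeShape(xShape, [0]))`, where `builder` stands in for a WebNN `MLGraphBuilder`.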
Current thinking (per https://github.com/webmachinelearning/webnn/issues/375#issuecomment-1674224992) is now to remove `squeeze` from WebNN (and to not add `unsqueeze` or `flattenTo2d`), implementing them all in terms of `reshape`, as libraries can trivially implement them with a few lines of code. Additionally, libraries could have other "reshape"-flavored ops we don't even know about. As a backend, WebNN doesn't offer anything interesting performance-wise by including them, while they do add to backend complexity and testing.
FYI @wacky6. "Remove squeeze sgtm :)"
Can this be closed off now that #478 is merged?
The `squeeze` operator is present, but its counterpart `unsqueeze` is missing (see the nearest ONNX / PyTorch / TF equivalents), which ought to be included for parity with `squeeze` and because many popular models use it: