webmachinelearning / webnn

🧠 Web Neural Network API
https://www.w3.org/TR/webnn/

Support coordinate transformation modes for Resample2d #270

Open Honry opened 2 years ago

Honry commented 2 years ago

Issue: DeepLabV3 contains a Resize node in its ONNX model and a ResizeBilinear node in its TFLite model. The corresponding op in WebNN is Resample2d. The Resize node sets coordinate_transformation_mode=align_corners, and the ResizeBilinear node sets align_corners=true and half_pixel_centers=false. WebNN doesn't support these coordinate transformation modes; its default behavior is equivalent to align_corners=false and half_pixel_centers=true.

The following compares the behavior across different backends and frameworks:


| TFLite ([half_pixel_centers, align_corners](https://www.tensorflow.org/mlir/tfl_ops#tflresize_bilinear_mlirtflresizebilinearop)) | ONNX ([coordinate_transformation_mode enum](https://github.com/onnx/onnx/blob/main/docs/Operators.md#resize)) | OpenVINO ([coordinate_transformation_mode enum](https://docs.openvino.ai/latest/openvino_docs_ops_image_Interpolate_4.html#doxid-openvino-docs-ops-image-interpolate-4)) | [DML](https://docs.microsoft.com/en-us/windows/win32/api/directml/ns-directml-dml_resample1_operator_desc) | WebNN |
| -- | -- | -- | -- | -- |
| half_pixel_centers=true | half_pixel | half_pixel | Supported | Default behavior |
| N/A | pytorch_half_pixel | pytorch_half_pixel | Supported | N/A |
| half_pixel_centers=false align_corners=false | asymmetric | asymmetric | Supported | N/A |
| N/A | N/A | tf_half_pixel_for_nn | N/A | N/A |
| align_corners=true | align_corners | align_corners | Supported | N/A |
| N/A | tf_crop_and_resize | N/A | Supported | N/A |
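For reference, the ONNX Resize documentation defines each mode as a mapping from an output index back to a (possibly fractional) input coordinate. A minimal Python sketch of four of those mappings follows. The function name `source_coordinate` is made up for illustration and is not part of any API listed in the table, and the guard for a length-1 output under align_corners is an assumption added to avoid division by zero.

```python
def source_coordinate(x_resized: int, length_original: int,
                      length_resized: int, mode: str) -> float:
    """Map an output index to an input coordinate along one axis.

    Formulas follow the ONNX Resize coordinate_transformation_mode
    definitions; `source_coordinate` itself is a hypothetical helper.
    """
    scale = length_resized / length_original
    if mode == "half_pixel":
        return (x_resized + 0.5) / scale - 0.5
    if mode == "pytorch_half_pixel":
        return (x_resized + 0.5) / scale - 0.5 if length_resized > 1 else 0.0
    if mode == "asymmetric":
        return x_resized / scale
    if mode == "align_corners":
        if length_resized == 1:
            return 0.0  # assumption: avoid dividing by zero for a single output
        return x_resized * (length_original - 1) / (length_resized - 1)
    raise ValueError(f"unsupported mode: {mode}")


# Upsample a 2-sample axis to 4 samples and list where each output reads from.
for mode in ("half_pixel", "asymmetric", "align_corners"):
    coords = [source_coordinate(x, 2, 4, mode) for x in range(4)]
    print(mode, coords)
```

For this 2-to-4 upsample, half_pixel reads from -0.25, 0.25, 0.75, 1.25, asymmetric reads from 0, 0.5, 1, 1.5, and align_corners spreads its samples evenly from 0 to 1 so the first and last outputs land exactly on the first and last inputs.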

Note: The DML backend can support these modes by computing its InputPixelOffsets and OutputPixelOffsets members. See the implementation in the ONNX Runtime DML backend for reference.

Open: Can we support these coordinate transformation modes in WebNN, at least asymmetric, half_pixel_centers, and align_corners? Alternatively, we could follow DML and define InputPixelOffsets and OutputPixelOffsets options in Resample2d to implement the various coordinate transformation modes.
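To make the offset-based suggestion concrete, here is a rough Python sketch of how a scale plus a pair of per-axis pixel offsets could express three of these modes. It assumes a generic transform of the form `x_in = (x_out - output_offset) / scale - input_offset`. That formula, the helper names, and the idea of exposing such values as Resample2d options are assumptions for discussion, not the existing WebNN or DirectML API.

```python
def offsets_for_mode(mode: str, length_original: int, length_resized: int):
    """Return (scale, input_offset, output_offset) for one axis.

    Hypothetical helper: shows how each mode could reduce to a scale and
    two offsets under the assumed generic transform.
    """
    scale = length_resized / length_original
    if mode == "half_pixel":
        return scale, 0.5, -0.5
    if mode == "asymmetric":
        return scale, 0.0, 0.0
    if mode == "align_corners":
        # align_corners changes the effective scale so the end points coincide.
        if length_original > 1 and length_resized > 1:
            scale = (length_resized - 1) / (length_original - 1)
        return scale, 0.0, 0.0
    raise ValueError(f"unsupported mode: {mode}")


def source_coordinate(x_resized: int, scale: float,
                      input_offset: float, output_offset: float) -> float:
    """Assumed generic transform from an output index to an input coordinate."""
    return (x_resized - output_offset) / scale - input_offset


# Resize a 4-sample axis to 8 samples under each mode.
for mode in ("half_pixel", "asymmetric", "align_corners"):
    scale, in_off, out_off = offsets_for_mode(mode, 4, 8)
    print(mode, [round(source_coordinate(x, scale, in_off, out_off), 4)
                 for x in range(8)])
```

Under this assumption only align_corners needs a modified scale; half_pixel and asymmetric differ purely in their offsets.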

fdwr commented 1 day ago

Most of these resampling transformations were mistakes (yes, I added them to DML for completeness using a generic transform approach, but I don't recommend propagating them in perpetuity). The correct approach, according to computer vision/imaging experts, is to resample the center point of the pixels, not asymmetrically (which causes image shifting) nor as point samples (which causes misalignment). That approach (half pixel) is used by libraries like OpenCV, SciPy, and MATLAB. Recent versions of PyTorch and TF do the right thing too, and it's ONNX's default.

Also see: https://medium.com/hackernoon/how-tensorflows-tf-image-resize-stole-60-days-of-my-life-aba5eb093f35
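The shifting mentioned above can be checked numerically. The sketch below uses made-up helper names and the ONNX coordinate formulas. It downsamples an 8-sample axis to 4 samples and compares the mean sampled input coordinate with the true center of the input (3.5 in index space). It is only an illustration of the effect, not a measurement from any of the backends discussed.

```python
def sample_positions(length_original: int, length_resized: int, mode: str):
    """Input coordinates read by each output index along one axis."""
    scale = length_resized / length_original
    positions = []
    for x in range(length_resized):
        if mode == "half_pixel":
            positions.append((x + 0.5) / scale - 0.5)
        elif mode == "asymmetric":
            positions.append(x / scale)
        elif mode == "align_corners":
            # Requires length_resized > 1 in this simplified sketch.
            positions.append(x * (length_original - 1) / (length_resized - 1))
        else:
            raise ValueError(f"unsupported mode: {mode}")
    return positions


def center_shift(length_original: int, length_resized: int, mode: str) -> float:
    """Mean sampled coordinate minus the input's center, in input pixels."""
    positions = sample_positions(length_original, length_resized, mode)
    input_center = (length_original - 1) / 2
    return sum(positions) / len(positions) - input_center


for mode in ("half_pixel", "asymmetric", "align_corners"):
    positions = sample_positions(8, 4, mode)
    print(mode,
          "shift:", round(center_shift(8, 4, mode), 4),
          "step:", round(positions[1] - positions[0], 4))
```

In this example half_pixel stays centered with a step of 2 input pixels, while asymmetric samples sit half an input pixel to the left of center. align_corners stays centered but steps by 7/3 rather than the nominal factor of 2, which is one way to read the misalignment remark.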