webmachinelearning / webnn

🧠 Web Neural Network API
https://www.w3.org/TR/webnn/

Divide-by-zero outcome should be standardized #691

Open huningxin opened 6 months ago

huningxin commented 6 months ago

This issue was raised by @wacky6 and @a-sully in Chromium CL reviews 5064994 and 5541389.

The discussion was about MLBatchNormalizationOptions.epsilon:

epsilon, of type float, defaulting to 1e-5. A small value to prevent computational error due to divide-by-zero.

epsilon is used by batchNormalization in its calculation:

Output = Scale * ((Input - Mean) / sqrt(Variance + Epsilon)) + Bias

The WebNN spec doesn't put any restriction on epsilon, which means that if both variance and epsilon are 0, a divide-by-zero error occurs.
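To make the failure mode concrete, here is a minimal scalar sketch of the per-element calculation above (batchNormElement is a hypothetical helper, not part of the spec), evaluated with IEEE 754 arithmetic as JavaScript numbers are:

```ts
// Per-element form of the batchNormalization formula above; a hypothetical
// helper for illustration only.
function batchNormElement(
  input: number, mean: number, variance: number,
  epsilon: number, scale = 1, bias = 0,
): number {
  return scale * ((input - mean) / Math.sqrt(variance + epsilon)) + bias;
}

batchNormElement(0.5, 0.5, 0, 0);    // 0 / 0    -> NaN
batchNormElement(1.0, 0.5, 0, 0);    // 0.5 / 0  -> Infinity
batchNormElement(0.0, 0.5, 0, 0);    // -0.5 / 0 -> -Infinity
batchNormElement(1.0, 0.5, 0, 1e-5); // finite; the default epsilon avoids the division by zero
```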

epsilon is also used by instanceNormalization and layerNormalization, which may hit the same division-by-zero error. The generic element-wise binary div should also be considered.

The proposal is to standardize the divide-by-zero outcome. For example, as @fdwr mentioned, DirectML produces NaN if variance and epsilon are both 0.

Refer to the Wikipedia Division by Zero page (https://en.wikipedia.org/wiki/Division_by_zero) for more information.
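For context, a minimal sketch of how epsilon reaches the operation through the WebNN API; descriptor field names follow the current spec draft, and epsilon: 0 together with an all-zero variance is deliberately chosen to trigger the edge case under discussion:

```ts
// Sketch, assuming a WebNN-enabled browser; WebNN type declarations are not
// in lib.dom.d.ts yet, so the globals are declared loosely here.
declare const MLGraphBuilder: any;

const context = await (navigator as any).ml.createContext();
const builder = new MLGraphBuilder(context);

// NCHW input with 2 channels; mean and variance are per-channel (default axis 1).
const input = builder.input('input', { dataType: 'float32', shape: [1, 2, 2, 2] });
const mean = builder.constant({ dataType: 'float32', shape: [2] }, new Float32Array([0, 0]));
const variance = builder.constant({ dataType: 'float32', shape: [2] }, new Float32Array([0, 0]));

// With variance == 0 and epsilon == 0 the denominator sqrt(0 + 0) is 0.
// The spec currently leaves the resulting value (NaN, ±Infinity, ...) to the backend.
const output = builder.batchNormalization(input, mean, variance, { epsilon: 0 });
```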

inexorabletash commented 6 months ago

WG telecon: @fdwr and @huningxin to investigate behavior across different backends and come back with a concrete proposal.

Consider marking this as an interop issue (https://github.com/webmachinelearning/webnn/labels/interop) if we see meaningful differences across backends.

fdwr commented 5 months ago

The generic element-wise binary div should also be considered.

For ordinary div, I'd expect standard IEEE 754 floating-point behavior:
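Illustrated below with plain JavaScript numbers, which follow IEEE 754 double-precision semantics; whether each backend's element-wise div matches this baseline is what the investigation needs to confirm:

```ts
// Scalar IEEE 754 division outcomes.
console.log( 1 / 0);  //  Infinity
console.log(-1 / 0);  // -Infinity
console.log( 0 / 0);  //  NaN
console.log( 1 / -0); // -Infinity (the sign of zero matters)
```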

Investigation ongoing across hardware... ⏳.