webmachinelearning / webnn

🧠 Web Neural Network API
https://www.w3.org/TR/webnn/

Need to restrict the value of alpha to be positive for elu operation #383

Open lisa0314 opened 1 year ago

lisa0314 commented 1 year ago

According to the ELU definition on Wikipedia and in the original paper, alpha should be positive.
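For reference, the standard definition from the paper:

$$\mathrm{ELU}(x) = \begin{cases} x, & x > 0 \\ \alpha \, (e^{x} - 1), & x \le 0 \end{cases}$$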

This issue was raised by @huningxin in WebNN Chromium CL review. Thanks Ningxin!

huningxin commented 1 year ago

Thanks for opening this issue, @lisa0314 !

A quick survey of native ML APIs' support:

/cc @wacky6 @fdwr

fdwr commented 1 year ago

Hmm, ML APIs typically don't restrict floating-point inputs (more often they reject invalid integer quantities like bad axes/sizes), and for the sake of broader compat we probably shouldn't unnecessarily reject values that work in other libraries. E.g.:

import torch

x = torch.tensor([-3, 3], dtype=torch.float32)
s = torch.nn.ELU(alpha=-1)  # ✅ works, and even float('nan') is allowed.
y = s(x)

print("value:", y)
print("shape:", y.shape)
print("dtype:", y.dtype)

# value: tensor([0.9502, 3.0000])
# shape: torch.Size([2])
# dtype: torch.float32

The plot has a kink, but otherwise looks non-degenerate:

UPDATE: Fixed graph after Ningxin's comment. [plot: ELU curve with alpha = -1]

If we rejected negative values and someone was relying on them for compat reasons, what would be the decomposition? I suppose you could fall back to elementwiseIf(greater(x, 0), x, scale * (exp(x) - 1)), now that elementwiseIf and greater are pending operators.
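For illustration, here's a NumPy sketch of that fallback (mine, not spec text), with np.where and a plain comparison standing in for the pending elementwiseIf and greater operators:

import numpy as np

def elu_fallback(x, alpha):
    # elementwiseIf(greater(x, 0), x, alpha * (exp(x) - 1)),
    # written with np.where standing in for the pending operators.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1))

x = np.array([-3.0, 3.0], dtype=np.float32)
print(elu_fallback(x, alpha=-1.0))  # [0.9502 3.    ] — matches the PyTorch result above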

huningxin commented 1 year ago

@fdwr

When alpha is -1, the calculation for negative inputs becomes -1 * (exp(x) - 1), and the plot would look like: [plot: ELU curve with alpha = -1]
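(If anyone wants to reproduce it, here's a quick matplotlib sketch, assuming the plot shows the full ELU curve with alpha = -1:)

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-5, 5, 200)
y = np.where(x > 0, x, -1.0 * (np.exp(x) - 1))  # ELU with alpha = -1
plt.plot(x, y)
plt.title("ELU, alpha = -1")
plt.grid(True)
plt.show()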

> if someone did need a negative ~~alpha~~ scale coefficient for negative inputs for whatever unusual compat reason, what would be the decomposition?

I think the decomposition sample in the current spec still works for negative alpha, e.g. -1:

// Computes max(0, x) + alpha * (exp(min(0, x)) - 1); here alpha = -1.
return builder.add(
          builder.max(builder.constant(0), x),
          builder.mul(
            builder.constant(-1), // alpha = -1
            builder.sub(
              builder.exp(builder.min(builder.constant(0), x)),
              builder.constant(1))));
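A quick NumPy check (my own sketch) that this min/max form agrees with the if-based form even when alpha is negative:

import numpy as np

def elu_spec_decomposition(x, alpha):
    # max(0, x) + alpha * (exp(min(0, x)) - 1), mirroring the spec sample above
    return np.maximum(0, x) + alpha * (np.exp(np.minimum(0, x)) - 1)

x = np.linspace(-5, 5, 11).astype(np.float32)
lhs = elu_spec_decomposition(x, alpha=-1.0)
rhs = np.where(x > 0, x, -1.0 * (np.exp(x) - 1))
print(np.allclose(lhs, rhs))  # True — the scale multiply after min handles negative alpha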

(BTW, there is a typo in the elu sample code, I'll fix it.)

fdwr commented 1 year ago

@huningxin: Doh, fixed graph. 👍 Yep, you're right: because the scale multiply occurs after the min, the existing decomposition works fine.

I think elu should just return a result equivalent to its decomposition elementwiseIf(greater(x, 0), x, scale * (exp(x) - 1)), whether the alpha scale is positive or negative. Similarly for NaNs, I'd just follow standard IEEE behavior and propagate them through. We don't have special checks for NaNs in the other floating-point activation/elementwise operators, and consider that if scale were a tensor instead of a single scalar, we wouldn't bother to scan every value inside it.
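For what it's worth, a quick NumPy check (my sketch) that a NaN input simply propagates through the decomposition under ordinary IEEE arithmetic, with no special-casing needed:

import numpy as np

x = np.array([np.nan, -3.0, 3.0], dtype=np.float32)
y = np.maximum(0, x) + (-1.0) * (np.exp(np.minimum(0, x)) - 1)  # alpha = -1
print(y)  # [   nan 0.9502 3.    ] — the NaN flows through both branches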