[Closed] pfeatherstone closed this issue 3 years ago
Using native PyTorch complex tensors, the activation should be:
z = torch.relu(torch.abs(z) + b) * torch.exp(1.j * torch.angle(z))
which can be re-written as:
z = z * torch.relu(1.0 + b / torch.abs(z))
so yeah, pretty sure that's a bug
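For reference, here is a quick numerical check that the two expressions above agree for nonzero inputs. This is a standalone sketch, not code from this repo; the names z and b are illustrative only:

```python
import torch

torch.manual_seed(0)

# Random complex inputs and a scalar bias (illustrative values).
z = torch.randn(8, dtype=torch.complex64)
b = -0.5

# Form from the paper: relu(|z| + b) * exp(i * angle(z))
paper_form = torch.relu(torch.abs(z) + b) * torch.exp(1.j * torch.angle(z))

# Equivalent rewrite (valid for z != 0): z * relu(1 + b / |z|)
rewritten = z * torch.relu(1.0 + b / torch.abs(z))

print(torch.allclose(paper_form, rewritten, atol=1e-6))  # expected: True
```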
This whole modrelu activation looks very much like the soft-thresholding operator from ℓ1 regularization, and I recall deciding that the flipped-sign parameterization would make more sense, since intuitively b specifies the clipping threshold.
As such, this is not a bug: the sign of b is just flipped. To clip the modulus at 0.5 you need to call modrelu(z, 0.5) (current implementation) instead of modrelu(z, -0.5) (as in the proposed bugfix). The second variant seemed highly counterintuitive to me back then, and it still does. I should have indicated this in the docstring, though, since it deviates from the original paper.
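To illustrate the sign convention being described, here is a minimal sketch. It assumes the current implementation computes z * relu(1 - threshold / |z|) (as suggested by the proposed change below); function names are hypothetical, not the repo's API:

```python
import torch

def modrelu_repo(z, threshold):
    # Assumed current convention: `threshold` is the clipping level on the
    # modulus, so moduli below `threshold` are zeroed out (sketch only).
    return z * torch.relu(1.0 - threshold / torch.abs(z))

def modrelu_paper(z, b):
    # Parameterization from the original modReLU paper: relu(|z| + b) * z / |z|.
    return torch.relu(torch.abs(z) + b) * torch.exp(1.j * torch.angle(z))

torch.manual_seed(0)
z = torch.randn(8, dtype=torch.complex64)

# Clipping the modulus at 0.5 in the repo convention matches the paper's b = -0.5.
print(torch.allclose(modrelu_repo(z, 0.5), modrelu_paper(z, -0.5), atol=1e-6))  # True
```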
Maybe just some updated documentation would be sufficient then.
Great docs! Cheers
Currently the implementation uses torch.relu(1. - threshold / modulus). Shouldn't it be torch.relu(1. + threshold / modulus), i.e. torch.relu(1. - threshold / modulus) -> torch.relu(1. + threshold / modulus)? The paper states relu(|z| + b) * z / |z|, not relu(|z| - b) * z / |z|.