Closed digantamisra98 closed 4 years ago
Thank you for the issue report.
I think Mish is a very interesting activation function. I have added it to the latest version (v0.15.1) of ruby-dnn.
class Mish < Layer
  def forward(x)
    @x = x
    x * Xumo::NMath.tanh(Softplus.new.forward(x))
  end

  def backward(dy)
    omega = 4 * (@x + 1) + 4 * Xumo::NMath.exp(2 * @x) + Xumo::NMath.exp(3 * @x) + Xumo::NMath.exp(@x) * (4 * @x + 6)
    delta = 2 * Xumo::NMath.exp(@x) + Xumo::NMath.exp(2 * @x) + 2
    dy * (Xumo::NMath.exp(@x) * omega) / delta**2
  end
end
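For anyone who wants to sanity-check the closed-form gradient used in `backward`, here is a minimal scalar sketch in plain Ruby. It is not part of ruby-dnn: the helper names (`softplus`, `mish`, `mish_grad`, `numeric_grad`) are made up for illustration, and it uses the standard `Math` module instead of `Xumo`. It compares the omega/delta expression against a central finite difference.

```ruby
# Scalar reference versions (hypothetical helpers, not ruby-dnn API).
def softplus(x)
  Math.log(1.0 + Math.exp(x))
end

# Mish(x) = x * tanh(softplus(x)), same as the forward pass above.
def mish(x)
  x * Math.tanh(softplus(x))
end

# Closed-form derivative, mirroring the omega/delta expression in backward.
def mish_grad(x)
  omega = 4 * (x + 1) + 4 * Math.exp(2 * x) + Math.exp(3 * x) + Math.exp(x) * (4 * x + 6)
  delta = 2 * Math.exp(x) + Math.exp(2 * x) + 2
  Math.exp(x) * omega / delta**2
end

# Central finite difference as an independent estimate of the derivative.
def numeric_grad(x, h = 1e-5)
  (mish(x + h) - mish(x - h)) / (2 * h)
end

[-3.0, -1.0, 0.0, 0.5, 2.0].each do |x|
  diff = (mish_grad(x) - numeric_grad(x)).abs
  puts format("x=%5.2f analytic=%.6f numeric=%.6f diff=%.2e", x, mish_grad(x), numeric_grad(x), diff)
end
```

The two estimates should agree closely at each sample point. Note that the closed form evaluates `exp(3 * x)`, which can overflow for large positive inputs, so this check is only meant for moderate values of `x`.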
@unagiootoro Thank you for the consideration and providing the implementation!
I'm new to Ruby and have only used it for web scraping, but hopefully this can be considered. Nice work on the library, though.
Mish is a novel activation function proposed in this paper. It has shown promising results so far and has been adopted in several packages including:
All benchmarks, analysis, and links to official package implementations can be found in this repository.
It would be nice to have Mish as an option within the activation function group.
This is the comparison of Mish with other conventional activation functions in a SEResNet-50 for CIFAR-10: