Closed: wolviey closed this issue 2 years ago
Thank you very much for reporting the bug! As we are heading towards version 1.0, this is the proper time to report bugs.
You are correct:
https://towardsdatascience.com/sigmoid-neuron-deep-neural-networks-a4cd35b629d7
https://en.wikipedia.org/wiki/Sigmoid_function
https://beckernick.github.io/sigmoid-derivative-neural-network/
You are welcome. Using sigmoid is not such a good idea these days anyway, though unfortunately models that use sigmoid with your library will now act worse than before. By the way, thanks for the good work. Really great library.
Sorry, it is probably better to close this with a commit.
You are correct. This implementation is behaving like a Leaky ReLU. I need to look further into this.
My models converge better with the error...
Yes, definitely your model will work better, like I said. The sigmoid function is generally good for binary outputs; otherwise, in terms of normalization, error rates, and gradients, your model will work much better.
But in that case I think it is better to compare your activation function with Leaky ReLU and check by how much it is better. Even though your activation function and Leaky ReLU look similar, there is a small difference. In my humble opinion Leaky ReLU should work a little better, and if you compare the arithmetic cost of these functions, I guess Leaky ReLU is less expensive. A small comparison sketch follows below.
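For illustration, here is a minimal, self-contained Free Pascal sketch of such a comparison. It is not code from the library: the function names, the 0.01 negative slope for Leaky ReLU, and the TNeuralFloat = Single alias are assumptions made for this example.

program CompareActivations;

{$mode objfpc}

uses
  SysUtils;

type
  TNeuralFloat = Single; // assumption: stands in for the library's float type

// Swish (SiLU): what the buggy "Sigmoid" actually computes.
function Swish(x: TNeuralFloat): TNeuralFloat;
begin
  Result := x / (1 + Exp(-x));
end;

// Leaky ReLU with the common 0.01 negative slope (the slope is an assumption).
function LeakyReLU(x: TNeuralFloat): TNeuralFloat;
begin
  if x > 0 then
    Result := x
  else
    Result := 0.01 * x;
end;

var
  i: Integer;
begin
  // Print both functions over a small range to see how similar they are:
  // both pass roughly linearly through positive x and stay near zero for
  // negative x, but Swish pays for an Exp call on every evaluation while
  // Leaky ReLU needs only a comparison and a multiplication.
  for i := -5 to 5 do
    WriteLn(Format('x = %3d  swish = %8.4f  leaky relu = %8.4f',
      [i, Swish(i), LeakyReLU(i)]));
end.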
Maybe you can keep both activation functions under different names, so people do not mix them up. That way you can keep your activation function and also stay standard, and people will not get confused. It is also better for comparing with other frameworks without cheating :P
It turns out that this error made this function work like the Swish function... I'll fix this.
Added TNNetSwish in #65.
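As a usage note, here is a minimal sketch of how the new layer might slot into a network, assuming the CAI Neural API style of parameterless activation layers; the unit name, constructor parameters, and layer sizes here are assumptions for illustration, not taken from the pull request.

program SwishLayerSketch;

{$mode objfpc}

uses
  neuralnetwork; // assumed unit name from the library

var
  NN: TNNet;
begin
  NN := TNNet.Create();
  NN.AddLayer(TNNetInput.Create(32, 32, 3));               // e.g. a 32x32 RGB input
  NN.AddLayer(TNNetConvolutionLinear.Create(16, 3, 1, 1)); // convolution without activation
  NN.AddLayer(TNNetSwish.Create());                        // the new Swish layer from #65
  NN.AddLayer(TNNetFullConnectLinear.Create(10));
  NN.AddLayer(TNNetSoftMax.Create());
  // ... training would proceed as usual from here ...
  NN.Free;
end.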
Probably the choice of activation function does not matter so much, but the current Sigmoid works like a ReLU. And if it needs to work as a ReLU, why does it need the Exp function in it?
// current implementation (buggy): this actually computes x * sigmoid(x)
function Sigmoid(x: TNeuralFloat): TNeuralFloat;
begin
  Result := x / (1 + Exp(-x));
end;
The right sigmoid function needs to be like this, I guess:
// correct sigmoid
function Sigmoid(x: TNeuralFloat): TNeuralFloat;
begin
  Result := 1 / (1 + Exp(-x));
end;
PS: The current Sigmoid activation probably works much better than the right sigmoid formula; it looks like a Leaky ReLU.
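For completeness, the relationship between the two formulas explains the maintainer's observation above: the buggy version is exactly the correct sigmoid multiplied by x, which is the definition of Swish (also called SiLU):

$$\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \mathrm{Swish}(x) = x\,\sigma(x) = \frac{x}{1 + e^{-x}}$$

For large positive x, Swish(x) is approximately x; for negative x it stays slightly below zero and approaches 0, which is why it resembles a smooth Leaky ReLU in practice.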