Open zhaokai5 opened 4 years ago

https://github.com/AlexeyAB/darknet/blob/05dee78fa3c41d92eb322d8d57fb065ddebc00b4/src/activations.c#L388

According to the formulation, grad_sp should be calculated as follows:

const float grad_sp = 1 - 1./(1 + exp(sp));

but your code is grad_sp = 1 - exp(-sp), so is this a mistake?

sp is log(1 + exp(x)), the softplus output, not the input x. So:

grad_sp = 1 - exp(-sp)
        = 1 - exp(-log(1 + exp(x)))
        = 1 - 1/(1 + exp(x))

I think you considered sp to be x. sp is the softplus output, so the gradient in the code is written in terms of that output rather than the input.

Thanks! It's my fault for considering sp to be x.