Open oalieno opened 3 years ago
By my calculation, if we use `1 / (1 + np.exp(x))` as the sigmoid function, then the derivative in the code is correct. But if we want to use the regular sigmoid function `1 / (1 + np.exp(-x))`, then we need to change the derivative to the following, which differs from the original derivative only by a minus sign:
g = (_dot_sigmoid(delta, tk.v_pred) - _identity(tk is sp_tk)) * context.alpha()
Both are correct and produce the same result. But I think using the regular sigmoid function is less confusing.
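To make the sign flip concrete, here is a minimal numerical sketch. It assumes a per-token log-likelihood of the usual negative-sampling form, `label * log(s(x)) + (1 - label) * log(1 - s(x))`, where `s` is whichever sigmoid is chosen. The function names (`sigmoid`, `flipped_sigmoid`, `log_likelihood`, `numeric_grad`) are hypothetical and are not taken from asm2vec. Whether a codebase stores the gradient of the log-likelihood or of its negative is a separate convention that this sketch does not address.

```python
import numpy as np

def sigmoid(x):
    # Regular sigmoid: 1 / (1 + e^{-x})
    return 1.0 / (1.0 + np.exp(-x))

def flipped_sigmoid(x):
    # The variant discussed above: 1 / (1 + e^{x}), which equals sigmoid(-x)
    return 1.0 / (1.0 + np.exp(x))

def log_likelihood(x, label, squash):
    # Assumed objective: label is 1 for the target token, 0 for a negative sample
    s = squash(x)
    return label * np.log(s) + (1 - label) * np.log(1.0 - s)

def numeric_grad(f, x, eps=1e-6):
    # Central finite difference, used only to check the analytic forms
    return (f(x + eps) - f(x - eps)) / (2 * eps)

xs = np.linspace(-3.0, 3.0, 13)
for label in (0, 1):
    g_regular = numeric_grad(lambda x: log_likelihood(x, label, sigmoid), xs)
    g_flipped = numeric_grad(lambda x: log_likelihood(x, label, flipped_sigmoid), xs)
    # Regular sigmoid: d/dx = label - sigmoid(x)
    assert np.allclose(g_regular, label - sigmoid(xs), atol=1e-6)
    # Flipped sigmoid: d/dx = flipped_sigmoid(x) - label (same shape, opposite sign)
    assert np.allclose(g_flipped, flipped_sigmoid(xs) - label, atol=1e-6)
```

Under these assumptions the two analytic gradients have the same form, "sigmoid minus indicator" versus "indicator minus sigmoid", so switching the sigmoid definition only requires negating the derivative expression.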
The lines that would change are:
https://github.com/Lancern/asm2vec/blob/6e975f0b9cf573358d2f73c308aa043e26fb9a10/asm2vec/internal/training.py#L187-L188