FengChendian opened this issue 1 year ago
In the train function, your code uses a softplus loss:

$$\text{loss} = \ln(1 + e^{x})$$

But in The Forward-Forward Algorithm: Some Preliminary Investigations, Hinton uses the logistic function:

$$p = \sigma\left(\sum_j y_j^2 - \theta\right)$$

Here,

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

Is this a mistake or a better choice?
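For reference, the two formulations are closely related: since $-\ln \sigma(x) = \ln(1 + e^{-x})$, minimizing the softplus loss $\ln(1 + e^{\theta - g})$ is exactly maximizing $\ln \sigma(g - \theta)$, the log of Hinton's logistic probability. A minimal sketch of this identity (the goodness values `g` and threshold `theta` below are hypothetical):

```python
import torch
import torch.nn.functional as F

# Hypothetical goodness values (sum of squared activations) and threshold
g = torch.tensor([0.5, 1.0, 2.0, 4.0, 8.0])
theta = 2.0

# Hinton's logistic probability that a sample is positive: p = sigma(g - theta)
p = torch.sigmoid(g - theta)

# Its negative log-likelihood ...
nll = -torch.log(p)

# ... coincides with the softplus loss log(1 + exp(theta - g))
softplus_loss = F.softplus(theta - g)

print(torch.allclose(nll, softplus_loss))  # True
```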
From what I observed, I believe the constant gradient provides a stable and sufficient amount of gradient, facilitating stable and relatively fast learning of the weights, which is why mpezeshki uses SoftPlus instead.

@mpezeshki, can you verify and confirm this explanation?
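On the constant-gradient point: the derivative of $\ln(1 + e^{x})$ is $\sigma(x)$, which tends to a constant $1$ for large $x$, so poorly fit samples keep receiving a strong, non-vanishing gradient. A quick illustrative check (the input values are arbitrary):

```python
import torch
import torch.nn.functional as F

# d/dx softplus(x) = sigmoid(x): bounded, and ~1 (near-constant) for large x
x = torch.linspace(-6.0, 6.0, steps=7, requires_grad=True)
F.softplus(x).sum().backward()

print(x.grad)                      # matches sigmoid(x)
print(torch.sigmoid(x.detach()))   # ~0 for very negative x, ~1 for very positive x
```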