Closed weinman closed 2 years ago
I believe this and issue #15 are corrected in the branch update_210714 which has been pushed today. Please note that the branch update_210714 has a major change in the interface, and may break the scripts and hyperparameters that worked on the master branch.
I think there is a bug in the combination of the `forward` method of class `decolle.LenetDECOLLE` with the `decolle.utils.test` calculation. Specifically, when iterating over the layers, the model determines whether it is at the final layer and, if so, applies a sigmoid to the membrane potential rather than the surrogate gradient function.
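For reference, my reading of that loop is roughly the following sketch (paraphrased; the layer internals and names here are stand-ins, not the actual DECOLLE source):

```python
import torch
import torch.nn as nn

def forward_sketch(layers, x):
    """Paraphrase of the behavior described above: every layer but the
    last applies the surrogate-gradient spike function to its membrane
    potential, while the final (output) layer applies a sigmoid."""
    s_out, r_out, u_out = [], [], []
    for i, layer in enumerate(layers):
        u = layer(x)  # membrane potential (stand-in for the LIF dynamics)
        if i + 1 == len(layers):
            s = torch.sigmoid(u)   # final layer: sigmoid, not spike fn
        else:
            s = (u > 0).float()    # stand-in for the surrogate spike fn
        # dropout and readout are the identity in the final layer, so its
        # readout entry is the sigmoid output itself
        s_out.append(s); r_out.append(s); u_out.append(u)
        x = s
    return s_out, r_out, u_out

torch.manual_seed(0)
layers = [nn.Linear(4, 4), nn.Linear(4, 3)]
_, r_out, _ = forward_sketch(layers, torch.randn(1, 4))
```

The point is only that `r_out[-1]` leaves `forward` already squashed into (0, 1).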
Because the dropout and readout functions of the final layer are the identity, the final readout output `r_out[-1]` of `forward` has already been passed through the sigmoid. When the `decolle.utils.test` method calculates the accuracies, it passes all of the final read-out values through a sigmoid, including that last-layer output, which has already been passed through one. This seems to have a detrimental effect on the reported accuracies of the final layer.
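A toy illustration of what the double sigmoid does in single precision (the values are chosen arbitrarily; nothing here is DECOLLE code):

```python
import torch

# Two clearly distinct membrane potentials. After one float32 sigmoid
# both saturate to exactly 1.0, so a second sigmoid maps them to the
# same value and any subsequent argmax must break the tie arbitrarily.
u = torch.tensor([18.0, 20.0], dtype=torch.float32)
once = torch.sigmoid(u)      # both round to 1.0000 in single precision
twice = torch.sigmoid(once)  # identical values: the ordering is lost
```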
(In theory, the sigmoid is monotonic, and since the follow-up call to `decolle.utils.prediction_mostcommon` uses the `argmax`, it shouldn't matter; but since the sigmoid saturates for single-precision floating-point values, the argmax is likely making arbitrary choices between ties at 1.00000.)

Assuming my interpretation is correct, I would gladly submit a PR to patch this, but it seems there are several larger pieces at play that may preclude an obvious fix. Options include:
- Changing the `sigmoid` (as shown above) in the `decolle.LenetDECOLLE.forward` method to the identity; this would break the use of `cross_entropy_one_hot` in the output-layer training loss function.
- Checking for `n+1 == len(net)` in `decolle.utils.test` to avoid putting the readout through a second sigmoid in the last layer; I think this would "break" the behavior for nets where `with_output_layer` is set to `False`, because of this test.

Thoughts?
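For concreteness, the second option could look something like the sketch below; the function name, loop structure, and flag handling are my guesses, not the actual `decolle.utils.test` code:

```python
import torch

def squash_readouts(r_out, with_output_layer=True):
    """Hypothetical stand-in for the accuracy bookkeeping in
    decolle.utils.test: sigmoid each layer's readout, but skip the
    last layer when forward already applied a sigmoid to it."""
    outs = []
    for n, r in enumerate(r_out):
        if with_output_layer and n + 1 == len(r_out):
            outs.append(r)  # already sigmoided in forward; pass through
        else:
            outs.append(torch.sigmoid(r))
    return outs

r_out = [torch.tensor([2.0, -1.0]), torch.tensor([0.9, 0.1])]
outs = squash_readouts(r_out)
```

With `with_output_layer=False` the guard becomes a no-op and every readout is squashed exactly once, which is the case the second caveat above worries about.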