JesusEV / nest-simulator

The NEST simulator
GNU General Public License v2.0

Implement early stopping #28

Open akorgor opened 3 months ago

akorgor commented 3 months ago

This PR replaces PR #2 and implements the early-stopping algorithm described in the corresponding TensorFlow implementation of the evidence accumulation task. The only difference is that here the early-stopping criterion is evaluated after each validation step (e.g., every tenth iteration) rather than after every iteration, as in the TensorFlow implementation. Since the criterion is driven primarily by the newest validation value, re-evaluating it for ten consecutive iterations against the same validation result, as the TensorFlow implementation does, seems wasteful.
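To make the scheduling difference concrete, here is a minimal sketch of the validation-coupled early-stopping loop described above. All names (`train_with_early_stopping`, `run_iteration`, `validate`, `patience`) are hypothetical placeholders, not NEST or TF API; it only illustrates that the criterion is checked once per validation step, not once per iteration.

```python
def train_with_early_stopping(
    run_iteration,      # hypothetical callable: iteration index -> None (one training step)
    validate,           # hypothetical callable: () -> validation loss
    n_iterations=100,
    validate_every=10,  # validation step runs every tenth iteration, as in the PR
    patience=3,         # stop after this many validations without improvement (assumed)
):
    best_val = float("inf")
    bad_validations = 0
    for it in range(n_iterations):
        if it % validate_every == 0:
            # Early-stopping criterion is evaluated only here, i.e., once per
            # validation step, not after every training iteration.
            val_loss = validate()
            if val_loss < best_val:
                best_val, bad_validations = val_loss, 0
            else:
                bad_validations += 1
            if bad_validations >= patience:
                return it, best_val  # stop before running this iteration
        run_iteration(it)
    return n_iterations, best_val
```

The TensorFlow variant would instead evaluate the `bad_validations` check inside every loop iteration, even though the underlying validation value only changes every `validate_every` iterations.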

Currently, the NEST losses match the TF losses for the iterations in which no weight update has occurred yet, i.e., the first two iterations.

| phase | NEST loss | TF loss |
|---|---|---|
| validation | 0.74115255000619 | 0.74115252494812 |
| training | 0.75886758496570 | 0.75886762142181 |
| training | 0.64575804540432 | 0.64488172531128 |
| training | 0.65036521347625 | 0.63414341211319 |
| training | 0.73954336799350 | 0.74553966522217 |
| training | 0.64857381599914 | 0.65522724390030 |
| test | 0.63293547882357 | 0.62222802639008 |
| test | 0.74871743652812 | 0.75369793176651 |
| test | 0.63259857933630 | 0.63924121856689 |
| test | 0.65171656508917 | 0.65074861049652 |

To test whether these deviations are caused by an extra spike resulting from numerical differences between TensorFlow and NEST, in the following experiment a recurrent neuron (index 82) was forced to emit an extra spike at t = 4000, i.e., during the second iteration. This perturbation causes a deviation in the loss's 11th decimal digit.

NEST (perturbed), alongside the unperturbed NEST losses from above:

| phase | NEST loss | NEST (perturbed) loss |
|---|---|---|
| validation | 0.74115255000619 | 0.74115255000619 |
| training | 0.75886758496570 | 0.75886758494402 |
| training | 0.64575804540432 | 0.64732471934116 |
| training | 0.65036521347625 | 0.65337249687406 |
| training | 0.73954336799350 | 0.73455409141632 |
| training | 0.64857381599914 | 0.64412005064290 |
| test | 0.63293547882357 | 0.64022872139437 |
| test | 0.74871743652812 | 0.74554388366270 |
| test | 0.63259857933630 | 0.62969715145448 |
| test | 0.65171656508917 | 0.64486264066109 |
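The "11th decimal digit" observation can be checked mechanically. Below is a small hypothetical helper (not part of the PR) that reports the first decimal digit at which two loss values differ, applied to the first training loss of the unperturbed and perturbed NEST runs quoted above:

```python
def first_differing_decimal(a, b, max_digits=14):
    """Return the 1-based position of the first decimal digit at which the
    fixed-point expansions of a and b differ, or None if they agree to
    max_digits decimal places."""
    da = f"{a:.{max_digits}f}".split(".")[1]
    db = f"{b:.{max_digits}f}".split(".")[1]
    for pos, (x, y) in enumerate(zip(da, db), start=1):
        if x != y:
            return pos
    return None

# First training loss, unperturbed vs. perturbed NEST (values from above):
# the extra spike shows up in the 11th decimal digit.
print(first_differing_decimal(0.75886758496570, 0.75886758494402))  # -> 11
```

By contrast, applying the same helper to the NEST and TF training losses of the second iteration (0.64575804540432 vs. 0.64488172531128) shows a deviation already in the 3rd decimal digit, i.e., far larger than a single extra spike produces.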

Since a single extra spike perturbs the loss of the affected iteration only in its 11th decimal digit, the much larger deviations between TF and NEST are probably not explained by spike-level numerical differences; the more likely cause is that the gradients are not computed correctly.
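One standard way to test the suspicion that the gradients are wrong is a finite-difference check: compare the analytic gradient against central differences of the loss. The sketch below is generic and hypothetical, with `loss_fn` and `grad_fn` standing in for the network's loss evaluation and its gradient computation; it is not NEST code.

```python
import numpy as np

def finite_difference_check(loss_fn, grad_fn, w, eps=1e-6):
    """Relative error between an analytic gradient and a central
    finite-difference estimate at parameter vector w.
    loss_fn: w -> scalar loss; grad_fn: w -> gradient array (both assumed)."""
    analytic = grad_fn(w)
    numeric = np.zeros_like(w)
    for i in range(w.size):
        e = np.zeros_like(w)
        e[i] = eps
        # Central difference approximates d(loss)/d(w_i) to O(eps^2).
        numeric[i] = (loss_fn(w + e) - loss_fn(w - e)) / (2 * eps)
    denom = np.linalg.norm(analytic) + np.linalg.norm(numeric) + 1e-12
    return np.linalg.norm(analytic - numeric) / denom
```

A large relative error (say above 1e-4 for a smooth loss) would confirm an incorrect gradient; a tiny one would point the search back toward the loss or state computation instead.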

github-actions[bot] commented 1 week ago

Pull request automatically marked stale!