emer / axon

Axon is a spiking, biologically-based neural model driven by predictive error-driven learning, for systems-level models of the brain
BSD 3-Clause "New" or "Revised" License

Add Eli & Rishi's correlation-based learning mechanism #17

Open rcoreilly opened 3 years ago

rcoreilly commented 3 years ago

http://arxiv.org/abs/2011.07334 -- the key equation is: (Var_i + Var_j - 2 Covar_ij) -- maximize the variance of the sender and of the receiver, and minimize the covariance between the two.
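To make the quantity concrete, here is a minimal sketch that evaluates Var_i + Var_j - 2 Covar_ij for two recorded activity traces. The function name `var_covar_term` and the use of plain batch (population) statistics are assumptions for illustration; they are not taken from the paper or from the axon code. Note that the expression is algebraically the variance of the difference act_i - act_j.

```python
from statistics import fmean


def var_covar_term(act_i, act_j):
    """Return Var_i + Var_j - 2*Covar_ij for two equal-length activity traces.

    Hypothetical helper: uses population (divide-by-N) statistics over the
    whole trace rather than running averages.
    """
    if len(act_i) != len(act_j) or len(act_i) == 0:
        raise ValueError("need two non-empty traces of equal length")
    mean_i, mean_j = fmean(act_i), fmean(act_j)
    # Deviations of each activation from its unit's mean ("act - mean").
    dev_i = [a - mean_i for a in act_i]
    dev_j = [a - mean_j for a in act_j]
    var_i = fmean([d * d for d in dev_i])
    var_j = fmean([d * d for d in dev_j])
    covar_ij = fmean([di * dj for di, dj in zip(dev_i, dev_j)])
    return var_i + var_j - 2.0 * covar_ij
```

For two perfectly anti-correlated binary traces such as `[0, 1, 0, 1]` and `[1, 0, 1, 0]` the term is 1.0, and for two identical traces it is 0, which matches the intent of rewarding high variance and penalizing shared variance.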

In my experiments, I updated the SWt (structural, spine, slow) weight in the slower outer-loop cycle as a function of accumulated Var and Covar stats, computed from simple running averages of act - mean values. Because SWt multiplies the regular "fast" learned weights, reducing it toward 0 acts as a graded, "soft" form of pruning.
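The running-average bookkeeping described here could look roughly like the following sketch. Everything named in it (`PairStats`, the `tau` time constant, `effective_weight`) is hypothetical and chosen for illustration; the actual axon implementation may organize these stats differently.

```python
class PairStats:
    """Running-average Var/Covar stats for one sender-receiver pair (hypothetical)."""

    def __init__(self, tau=10.0):
        self.dt = 1.0 / tau  # integration rate of the running averages (assumed)
        self.mean_i = 0.0
        self.mean_j = 0.0
        self.var_i = 0.0
        self.var_j = 0.0
        self.covar = 0.0

    def update(self, act_i, act_j):
        # Deviations from the current running means ("act - mean").
        di = act_i - self.mean_i
        dj = act_j - self.mean_j
        # Move each running average a fraction dt toward its new sample.
        self.mean_i += self.dt * di
        self.mean_j += self.dt * dj
        self.var_i += self.dt * (di * di - self.var_i)
        self.var_j += self.dt * (dj * dj - self.var_j)
        self.covar += self.dt * (di * dj - self.covar)


def effective_weight(swt, wt):
    """SWt multiplies the fast weight, so SWt near 0 softly prunes the synapse."""
    return swt * wt
```

In this sketch `update` would be called once per trial (or whatever the inner-loop unit is), and the accumulated `var_i`, `var_j`, and `covar` values would be read out in the slower outer loop to adjust SWt.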

Having worked through the logic here more carefully, I realized that my initial implementation had an error: it missed the factor of 2 on Covar_ij. I also realized that for pruning it makes more sense to include only the negative component of this value -- otherwise we get a Hebbian-like, variance-increasing force that is constantly working to increase the weights, and that force is not present in the pruning version.
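One way to read the distinction between the full rule and the pruning-only rule is sketched below. It assumes that the change in SWt is simply proportional to the value, and that "the negative component" means the -2 Covar_ij term; both are interpretations rather than something stated in the thread. The function name `swt_delta` and its parameters are hypothetical.

```python
def swt_delta(var_i, var_j, covar_ij, lrate=0.1, negative_only=False):
    """Proposed change to SWt from accumulated stats (illustrative only).

    Full rule:     lrate * (Var_i + Var_j - 2*Covar_ij)
    Pruning-only:  lrate * (-2*Covar_ij), dropping the variance terms.
    """
    # Hebbian-like part: variances are non-negative, so this only pushes up.
    positive_part = var_i + var_j
    # Covariance penalty, including the factor of 2 from the equation.
    negative_part = -2.0 * covar_ij
    if negative_only:
        return lrate * negative_part
    return lrate * (positive_part + negative_part)
```

With `var_i = var_j = 0.25` and `covar_ij = 0.1`, the full rule gives a small increase (0.03 at `lrate=0.1`) while the pruning-only variant gives a decrease (-0.02), which illustrates how dropping the variance terms removes the constant upward push on the weights.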

rcoreilly commented 3 years ago

Looks like using both the positive and negative components at a 0.1 learning rate works well in the large-scale lvis object recognition model: it significantly reduces the strength of the top-5 PCA components while driving solid "n strong" PCA components throughout learning. Still need to fix output layer dynamics, but decoding shows continued learning throughout!