tlaurie99 / reinforcement_learning


Continual backpropagation #1

Open tlaurie99 opened 2 weeks ago

tlaurie99 commented 2 weeks ago

Use of continual backpropagation to maintain the plasticity of neural networks

This is done by:

  1. Maintaining a contribution-utility metric for each neuron of the network
  2. Recording (by index) the neurons whose contribution to a given state is near 0
  3. If a neuron stays on this list for a certain number of SGD updates, it is considered "dead" and gets added to a pool of dead neurons
  4. Roughly once every ~200 SGD updates, one of these dead neurons is selected and reinitialized
  5. Initialize the output weights of the neuron to 0 (so that it does not change the current output)
  6. Reinitialize the input weights from the initial distribution (using something like Xavier or Kaiming)
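The reinitialization in steps 5 and 6 above could be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, layer shapes, and the choice of Kaiming-uniform for the incoming weights are assumptions.

```python
import torch
import torch.nn as nn


def reinit_dead_neurons(layer_in: nn.Linear, layer_out: nn.Linear, dead_idx):
    """Reinitialize 'dead' hidden units sitting between two linear layers.

    layer_in:  maps into the hidden layer (row i holds unit i's input weights)
    layer_out: maps out of the hidden layer (column i holds unit i's output weights)
    dead_idx:  indices of hidden units selected for reinitialization
    """
    with torch.no_grad():
        for i in dead_idx:
            # Step 5: zero the outgoing weights so the unit no longer
            # contributes to the network's current output
            layer_out.weight[:, i] = 0.0
            # Step 6: redraw the incoming weights from the initial
            # distribution (Kaiming-uniform here, as one option)
            new_row = torch.empty(1, layer_in.in_features)
            nn.init.kaiming_uniform_(new_row, a=5 ** 0.5)
            layer_in.weight[i] = new_row.squeeze(0)
            layer_in.bias[i] = 0.0


# usage on a toy two-layer block
lin1, lin2 = nn.Linear(4, 8), nn.Linear(8, 2)
reinit_dead_neurons(lin1, lin2, dead_idx=[2, 5])
```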
tlaurie99 commented 2 weeks ago

8/29

tlaurie99 commented 2 weeks ago

8/30

Weights have been exposed, iterated through, and summed over each layer. Running into issues with getting individual neuron outputs from layer l into layer l+1.

So far, it looks like a forward hook is needed
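A forward hook along those lines might look like the sketch below: each `nn.Linear` output is stashed in a dict so per-neuron activations are available after a forward pass. The network shape and layer names are illustrative, not from the repo.

```python
import torch
import torch.nn as nn

# layer name -> output tensor of shape (batch, layer_size)
activations = {}


def make_hook(name):
    # Capture the layer's output so per-neuron contributions
    # can be computed after the forward pass
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook


net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
for name, module in net.named_modules():
    if isinstance(module, nn.Linear):
        module.register_forward_hook(make_hook(name))

x = torch.randn(3, 4)
_ = net(x)
# activations now holds the (batch, layer_size) output of every Linear layer
```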

tlaurie99 commented 1 week ago

9/3 - 9/6

Dropped the RLlib modules and created the layers from scratch using torch -- this allows easy access to both outputs and weights. The utility function is currently calculated from a layer output of shape (mini_batch_size, layer_size)
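One hypothetical form of that calculation, assuming a (mini_batch_size, layer_size) output: mean absolute activation per unit, scaled by the magnitude of the unit's outgoing weights, tracked as an exponential running average. The exact utility definition and decay rate here are assumptions, not necessarily what the repo uses.

```python
import torch


def contribution_utility(layer_output, out_weight, running_util, decay=0.99):
    """Running contribution-utility estimate per hidden unit (sketch).

    layer_output: (mini_batch_size, layer_size) activations of layer l
    out_weight:   (next_layer_size, layer_size) weights into layer l+1
    running_util: (layer_size,) previous running utility
    """
    mean_act = layer_output.abs().mean(dim=0)   # (layer_size,)
    out_mag = out_weight.abs().sum(dim=0)       # (layer_size,)
    instant = mean_act * out_mag                # instantaneous contribution
    # exponential running average over SGD updates
    return decay * running_util + (1 - decay) * instant


# toy usage: all-ones activations and weights give instant utility of 2.0
out = torch.ones(5, 3)
w = torch.ones(2, 3)
util = contribution_utility(out, w, torch.zeros(3), decay=0.5)
```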

Not sure how to reinitialize the biases for the neuron -- attempted to do it as torch does here, but ran into dimension issues

The above was due to only having a 1-D tensor (128x1), while the fan-in / fan-out calculation requires a 2-D tensor of input and output weights -- changed to a uniform init, which works directly on a 1-D tensor
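The dimension issue above can be reproduced directly: `kaiming_uniform_` computes fan-in/fan-out and so needs a tensor with at least 2 dims, while `uniform_` accepts a 1-D bias. The sizes (128, 64) and the bound of 0.1 are illustrative; the second half mirrors how `nn.Linear.reset_parameters` derives the bias bound from the weight's fan-in.

```python
import torch
import torch.nn as nn

bias = torch.empty(128)
# nn.init.kaiming_uniform_(bias) would raise here: fan-in/fan-out
# cannot be computed for a tensor with fewer than 2 dimensions
nn.init.uniform_(bias, -0.1, 0.1)  # works directly on a 1-D tensor

# torch's own Linear.reset_parameters computes the bias bound from the
# 2-D weight's fan-in, then draws the bias uniformly:
weight = torch.empty(128, 64)
nn.init.kaiming_uniform_(weight, a=5 ** 0.5)
fan_in = weight.size(1)            # 64 inputs per unit
bound = 1 / fan_in ** 0.5          # 1/sqrt(64) = 0.125
nn.init.uniform_(bias, -bound, bound)
```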

tlaurie99 commented 5 days ago

9/9 - 9/13

-> After seeing how the theory works, decided to build on the authors' GitHub implementation and modified the cbp file