tlaurie99 / reinforcement_learning


Continual backpropagation #1

Open tlaurie99 opened 2 weeks ago

tlaurie99 commented 2 weeks ago

Use of continual backpropagation to maintain the plasticity of neural networks

This is done by:

  1. Maintaining a contribution-utility metric for each neuron of the network
  2. Recording (by index) the neurons whose contribution to a given state is near 0
  3. If a neuron stays on this list for a certain number of SGD updates, it is considered "dead" and gets added to a pool of dead neurons
  4. Roughly once every ~200 SGD updates, one of these dead neurons is selected and reinitialized
  5. Initialize the output weights of the neuron to 0 (so that it does not change the current output)
  6. Reinitialize the input weights from the initial distribution (using something like Xavier or Kaiming)
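The reinitialization in steps 5 and 6 above could be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, layer shapes, and the choice of Kaiming-uniform for the incoming weights are assumptions.

```python
import torch
import torch.nn as nn


def reinit_dead_neurons(layer_in: nn.Linear, layer_out: nn.Linear, dead_idx):
    """Reinitialize 'dead' hidden units sitting between two linear layers.

    layer_in:  maps into the hidden layer (row i holds unit i's input weights)
    layer_out: maps out of the hidden layer (column i holds unit i's output weights)
    dead_idx:  indices of hidden units selected for reinitialization
    """
    with torch.no_grad():
        for i in dead_idx:
            # Step 5: zero the outgoing weights so the unit no longer
            # contributes to the network's current output
            layer_out.weight[:, i] = 0.0
            # Step 6: redraw the incoming weights from the initial
            # distribution (Kaiming-uniform here, as one option)
            new_row = torch.empty(1, layer_in.in_features)
            nn.init.kaiming_uniform_(new_row, a=5 ** 0.5)
            layer_in.weight[i] = new_row.squeeze(0)
            layer_in.bias[i] = 0.0


# usage on a toy two-layer block
lin1, lin2 = nn.Linear(4, 8), nn.Linear(8, 2)
reinit_dead_neurons(lin1, lin2, dead_idx=[2, 5])
```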
tlaurie99 commented 2 weeks ago

8/29

tlaurie99 commented 2 weeks ago

8/30

Weights have been exposed, iterated through, and summed over each layer. Running into issues with getting individual neuron outputs from layer l into layer l+1.

So far, it looks like a forward hook is needed
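A forward hook along those lines might look like the sketch below: each `nn.Linear` output is stashed in a dict so per-neuron activations are available after a forward pass. The network shape and layer names are illustrative, not from the repo.

```python
import torch
import torch.nn as nn

# layer name -> output tensor of shape (batch, layer_size)
activations = {}


def make_hook(name):
    # Capture the layer's output so per-neuron contributions
    # can be computed after the forward pass
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook


net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
for name, module in net.named_modules():
    if isinstance(module, nn.Linear):
        module.register_forward_hook(make_hook(name))

x = torch.randn(3, 4)
_ = net(x)
# activations now holds the (batch, layer_size) output of every Linear layer
```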

tlaurie99 commented 1 week ago

9/3 - 9/6

Dropped the RLlib modules and created the layers from scratch using torch -- this allows easy access to both outputs and weights. The utility function is currently calculated from a layer output of shape (mini_batch_size, layer_size)
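One hypothetical form of that calculation, assuming a (mini_batch_size, layer_size) output: mean absolute activation per unit, scaled by the magnitude of the unit's outgoing weights, tracked as an exponential running average. The exact utility definition and decay rate here are assumptions, not necessarily what the repo uses.

```python
import torch


def contribution_utility(layer_output, out_weight, running_util, decay=0.99):
    """Running contribution-utility estimate per hidden unit (sketch).

    layer_output: (mini_batch_size, layer_size) activations of layer l
    out_weight:   (next_layer_size, layer_size) weights into layer l+1
    running_util: (layer_size,) previous running utility
    """
    mean_act = layer_output.abs().mean(dim=0)   # (layer_size,)
    out_mag = out_weight.abs().sum(dim=0)       # (layer_size,)
    instant = mean_act * out_mag                # instantaneous contribution
    # exponential running average over SGD updates
    return decay * running_util + (1 - decay) * instant


# toy usage: all-ones activations and weights give instant utility of 2.0
out = torch.ones(5, 3)
w = torch.ones(2, 3)
util = contribution_utility(out, w, torch.zeros(3), decay=0.5)
```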

Not sure how to reinitialize the biases for the neuron -- attempted to do it as torch does here, but ran into dimension issues

The above was due to only having a 1-D tensor (128x1), while the fan-in / fan-out calculation requires a 2-D tensor of input and output weights -- changed to a uniform init, which works directly on a 1-D tensor
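The dimension issue above can be reproduced directly: `kaiming_uniform_` computes fan-in/fan-out and so needs a tensor with at least 2 dims, while `uniform_` accepts a 1-D bias. The sizes (128, 64) and the bound of 0.1 are illustrative; the second half mirrors how `nn.Linear.reset_parameters` derives the bias bound from the weight's fan-in.

```python
import torch
import torch.nn as nn

bias = torch.empty(128)
# nn.init.kaiming_uniform_(bias) would raise here: fan-in/fan-out
# cannot be computed for a tensor with fewer than 2 dimensions
nn.init.uniform_(bias, -0.1, 0.1)  # works directly on a 1-D tensor

# torch's own Linear.reset_parameters computes the bias bound from the
# 2-D weight's fan-in, then draws the bias uniformly:
weight = torch.empty(128, 64)
nn.init.kaiming_uniform_(weight, a=5 ** 0.5)
fan_in = weight.size(1)            # 64 inputs per unit
bound = 1 / fan_in ** 0.5          # 1/sqrt(64) = 0.125
nn.init.uniform_(bias, -bound, bound)
```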

tlaurie99 commented 5 days ago

9/9 - 9/13

-> After seeing how the theory works, decided to build on the authors' GitHub implementation and modified the cbp file