mlech26l / ncps

PyTorch and TensorFlow implementation of NCP, LTC, and CfC wired neural models
https://www.nature.com/articles/s42256-020-00237-3
Apache License 2.0

unable to train NCP using gradient-based reinforcement learning #14

Closed smolboii closed 3 years ago

smolboii commented 3 years ago

I am currently trying to train an NCP architecture with a Q-learning approach on the OpenAI Gym CartPole-v0 environment. I'm running into problems, however, as the agent simply refuses to learn, or at least takes far longer than a regular feed-forward neural network trained with deep Q-learning. For reference, I am using the PyTorch implementation, and you can have a look at my code here if you would like.

I am not utilising the recurrent aspect of the NCP at the moment, but I don't think that's what's causing the problem, as I tried a simple supervised learning test and that seemed to work fine. I also tried a regular feed-forward neural network in place of the NCP and that worked fine too, and, as mentioned, it was fairly quick to achieve good results. In my code that's just the DQNetwork class. A rough sketch of how I'm using the NCP is below.
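For context, a minimal sketch of the setup (assuming the current `ncps.wirings` / `ncps.torch` import paths and CartPole's 4-dimensional observations with 2 discrete actions; this is illustrative, not a copy of my agent.py):

```python
import torch
from ncps.wirings import NCP
from ncps.torch import LTC

# NCP wiring: CartPole has 4 observation features and 2 discrete actions,
# so the motor layer has 2 neurons (one Q-value per action).
wiring = NCP(
    inter_neurons=12,
    command_neurons=8,
    motor_neurons=2,
    sensory_fanout=4,
    inter_fanout=4,
    recurrent_command_synapses=4,
    motor_fanin=6,
)
q_net = LTC(input_size=4, units=wiring, batch_first=True)

def q_values(obs_batch: torch.Tensor) -> torch.Tensor:
    # Treat each observation as a length-1 sequence and drop the returned
    # hidden state, so the NCP is effectively used as a feed-forward network.
    x = obs_batch.unsqueeze(1)   # (batch, 1, 4)
    out, _ = q_net(x)            # out: (batch, 1, 2)
    return out.squeeze(1)        # (batch, 2) Q-values
```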

Is there some nuance to how I should approach this that I have missed, or have I perhaps made a silly error :P? Thank you.

mlech26l commented 3 years ago

Hi, NCPs tend to require more training iterations than feed-forward networks (this has to do with the sparse connectivity and the resulting vanishing gradients). So you could try applying several gradient updates instead of a single one, i.e., put lines 100-105 of agent.py in a loop.
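Roughly like this (a sketch only; `replay_buffer`, `compute_dqn_loss`, and the other names are placeholders standing in for the corresponding pieces of your agent.py):

```python
N_UPDATES = 4  # several gradient updates per environment step instead of one

for _ in range(N_UPDATES):
    # placeholder stand-ins for the replay sampling / loss code in agent.py
    batch = replay_buffer.sample(batch_size)
    loss = compute_dqn_loss(q_net, target_net, batch)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```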

Another thing you could try is to increase the connectivity (the fanin/fanout values), which should increase the capacity of the network.
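For example, a denser wiring could look like this (a sketch using the `ncps.wirings.NCP` constructor; the exact values are only illustrative):

```python
from ncps.wirings import NCP

# Larger fanout/fanin values -> more synapses between the layers,
# which increases the capacity of the NCP.
dense_wiring = NCP(
    inter_neurons=12,
    command_neurons=8,
    motor_neurons=2,
    sensory_fanout=8,              # up from 4 in the sparser wiring above
    inter_fanout=8,                # up from 4
    recurrent_command_synapses=6,  # up from 4
    motor_fanin=8,                 # up from 6
)
```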

smolboii commented 3 years ago

Thanks for the reply :). Yes, it seems I just wasn't patient enough; after simply letting it train longer, I am seeing the results I was expecting. Thank you very much for the help :).