Minimally-Cognitive Behavior

This code is a web-based implementation of a neural-network agent in a simple environment, based on the 1996 and 2003 papers by Randall Beer. Beer's networks are trained with an evolutionary method; the networks here are instead trained with a simple reinforcement learning scheme (policy gradients) over discrete actions.
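To make the training scheme concrete, here is a minimal sketch of a policy-gradient (REINFORCE) update over discrete actions. It is not the code from this project: the linear softmax policy stands in for the actual network, and every name (`action_probabilities`, `reinforce_update`, `train_bandit`) and default value (`learning_rate`, `gamma`) is an assumption made for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def action_probabilities(weights, obs):
    """Linear policy: weights[a][i] connects input i to the logit of action a."""
    logits = [sum(w * x for w, x in zip(row, obs)) for row in weights]
    return softmax(logits)


def sample_action(probs, rng):
    """Draw an action index from a categorical distribution."""
    r = rng.random()
    cumulative = 0.0
    for action, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return action
    return len(probs) - 1  # guard against floating-point round-off


def reinforce_update(weights, episode, learning_rate=0.1, gamma=0.99):
    """Apply one REINFORCE step in place.

    episode is a list of (observation, action, reward) tuples.
    """
    # Discounted return-to-go for every time step.
    returns = []
    g = 0.0
    for _, _, reward in reversed(episode):
        g = reward + gamma * g
        returns.append(g)
    returns.reverse()

    # Accumulate the gradient over the whole episode before touching weights.
    grads = [[0.0] * len(row) for row in weights]
    for (obs, action, _), g in zip(episode, returns):
        probs = action_probabilities(weights, obs)
        for a in range(len(weights)):
            # d log pi(action | obs) / d logit_a = 1[a == action] - probs[a]
            grad_logit = (1.0 if a == action else 0.0) - probs[a]
            for i, x in enumerate(obs):
                grads[a][i] += g * grad_logit * x

    # Gradient ascent on expected return.
    for row, grad_row in zip(weights, grads):
        for i, grad_value in enumerate(grad_row):
            row[i] += learning_rate * grad_value


def train_bandit(episodes=200, seed=0):
    """Toy stand-in environment: action 1 always pays 1.0, action 0 pays 0.0."""
    rng = random.Random(seed)
    weights = [[0.0], [0.0]]  # two actions, one constant input
    obs = [1.0]
    for _ in range(episodes):
        action = sample_action(action_probabilities(weights, obs), rng)
        reward = 1.0 if action == 1 else 0.0
        reinforce_update(weights, [(obs, action, reward)])
    return action_probabilities(weights, obs)
```

Calling `train_bandit()` returns the learned action probabilities for the toy two-action problem. In the real agent the observation would come from the environment's sensors and the episode would span many time steps.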

Some extensions I'm interested in:

- Adding UI to create a new agent, train an existing agent further, hand-tune an agent, or set an agent's hyper-parameters.
- Making things a little more deterministic, particularly in a way that's tweakable via the UI.
- Considering other ways of training the agent besides policy gradients + RL (overview):
  - Cross-entropy method (papers on CEM: 0, 1; keras-rl CEM implementation).
  - Policy gradient with a continuous action space (post 0).
  - TRPO? Q-learning? Actor-Critic?
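As a rough illustration of the cross-entropy method mentioned in the list above, here is a minimal, seeded sketch that searches over a flat parameter vector. It is an assumption-laden toy, not the keras-rl implementation: `cross_entropy_method`, its parameters, and the `toy_score` objective are hypothetical names, and in practice `score` would be the agent's total reward over one or more episodes.

```python
import random
import statistics


def cross_entropy_method(score, dim, iterations=30, population=100,
                         elite_fraction=0.2, initial_std=2.0, seed=0):
    """Maximize score(params) with a diagonal-Gaussian cross-entropy method."""
    rng = random.Random(seed)  # fixed seed makes the run repeatable
    mean = [0.0] * dim
    std = [initial_std] * dim
    n_elite = max(2, int(population * elite_fraction))

    for _ in range(iterations):
        # Sample a population of candidate parameter vectors.
        candidates = [
            [rng.gauss(m, s) for m, s in zip(mean, std)]
            for _ in range(population)
        ]
        # Keep the best-scoring fraction.
        candidates.sort(key=score, reverse=True)
        elite = candidates[:n_elite]
        # Refit the sampling distribution to the elite set.
        for i in range(dim):
            values = [c[i] for c in elite]
            mean[i] = sum(values) / len(values)
            # Small floor keeps the distribution from collapsing to a point.
            std[i] = statistics.pstdev(values) + 1e-3
    return mean


def toy_score(params):
    """Hypothetical objective with its maximum at (1.0, -0.5)."""
    x, y = params
    return -((x - 1.0) ** 2) - ((y + 0.5) ** 2)


best = cross_entropy_method(toy_score, dim=2)
```

Because the sampler takes an explicit `seed`, the same call always yields the same result, which is one simple way to get the kind of tweakable determinism listed above.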
