EndingCredits / Neural-Episodic-Control

Implementation of Deepmind's Neural Episodic Control

Neural Episodic Control

This is my attempt at replicating DeepMind's Neural Episodic Control agent. It is currently set up for running with the ALE, but can easily be adapted for other environments (you may want to use my older implementation here as a reference).

To run the code (after installing all dependencies):

python main.py --rom [path/to/rom/file.bin]

Further options can be found using:

python main.py -h

Only training is currently supported; there is no testing, saving, or loading. Scores are reported per episode, which here means once per life.

N.B.: There are a number of differences between this implementation and the original paper:

Many thanks to all the authors whose code I've shamelessly ripped off, e.g. the knn-dictionary code and the environment wrapper (even though they are probably unrecognisable by now). If you have a separate working implementation of NEC, I'd love to swap notes to check for errors and find any efficiency savings. Also, if you spot any (inevitable) bugs, please let me know.
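For anyone comparing notes: the knn-dictionary at the heart of NEC is a memory of (key, value) pairs queried with a kernel-weighted average over the k nearest keys. Below is a minimal NumPy sketch of that idea, using the inverse-distance kernel from the NEC paper; all class and parameter names here are hypothetical, not this repository's actual API.

```python
import numpy as np

class DND:
    """Minimal differentiable-neural-dictionary sketch (hypothetical names).

    Stores (key, value) pairs and answers queries with a kernel-weighted
    average over the k nearest stored keys.
    """

    def __init__(self, key_dim, capacity=100000, k=50, delta=1e-3):
        self.keys = np.empty((0, key_dim))
        self.values = np.empty((0,))
        self.capacity = capacity
        self.k = k
        self.delta = delta  # kernel smoothing constant

    def write(self, key, value):
        # Append the new pair; evict the oldest entry when over capacity.
        self.keys = np.vstack([self.keys, key])
        self.values = np.append(self.values, value)
        if len(self.values) > self.capacity:
            self.keys, self.values = self.keys[1:], self.values[1:]

    def lookup(self, query):
        # Squared Euclidean distance to every stored key.
        dists = np.sum((self.keys - query) ** 2, axis=1)
        idx = np.argsort(dists)[: self.k]
        # Inverse-distance kernel weights over the k nearest neighbours.
        w = 1.0 / (dists[idx] + self.delta)
        return float(np.sum(w * self.values[idx]) / np.sum(w))
```

In the full agent the keys are embeddings produced by the convolutional network and the lookup is differentiable so gradients flow back into the encoder; this sketch only shows the memory side.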

Dependencies

You'll have to look up how to install these yourself, but this project uses the following libraries:

Running the Unity demo

This is to train an agent to play the Roll-a-ball game from the Unity tutorial. Video of agent here: https://www.youtube.com/watch?v=6O93BOMFdUI

The Unity engine and the agent communicate by sending information to and from the running server. Observations from the engine are given in the form of a list of relevant objects in the scene. Each object is turned into a feature vector encoding object class, position, and velocity.
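The per-object encoding described above could be sketched along these lines: a one-hot class label concatenated with position and velocity. The class names and dimensions here are assumptions for illustration, not the repository's exact format.

```python
import numpy as np

# Hypothetical object classes for a Roll-a-ball-style scene.
CLASSES = ["player", "pickup", "wall"]

def encode_object(obj_class, position, velocity):
    """Turn one scene object into a fixed-length feature vector:
    one-hot class | 3-D position | 3-D velocity."""
    one_hot = np.zeros(len(CLASSES))
    one_hot[CLASSES.index(obj_class)] = 1.0
    return np.concatenate([one_hot, position, velocity])

def encode_observation(objects):
    """Stack per-object vectors into one (n_objects, feature_dim) array."""
    return np.stack([encode_object(c, p, v) for c, p, v in objects])
```

With three classes and 3-D position and velocity, each object becomes a length-9 vector, and an observation is a matrix with one row per object in the scene.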

N.B.: There is a bug in the environment code which means the environment sometimes doesn't reset properly. This shouldn't affect agent performance, but it means the number of episodes is reported incorrectly.

TODO list:

Technical improvements:

Experiments: