Open ehknight opened 7 years ago
You're perfectly correct: the current implementation is "just some Q-learning on a convolutional neural network with experience replay and target networks". I'll add an implementation of the original DQN, or one with prioritized experience replay, within the next two weeks.
It would be really nice if we could have an example that re-implements the original DQN paper exactly. The Ms. PacMan one is pretty close but AFAIK it has some subtle differences, such as window augmentation (let me know if I'm wrong on this and I'll close the issue). This would be really helpful so that people can see how the AgentNet syntax relates to other methods that implement the exact same thing.
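For readers following the thread, here is a rough sketch of the ingredients named above (Q-learning targets, experience replay, and a separate target network). It uses plain Python rather than AgentNet syntax, and every name in it (`ReplayBuffer`, `td_targets`, `target_q`) is hypothetical and chosen for illustration only. It is not the Ms. PacMan example's code and it does not reproduce the preprocessing or hyperparameters of the original DQN paper.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size FIFO store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # deque with maxlen silently drops the oldest transition when full
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement, as in plain experience replay
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_targets(batch, target_q, gamma=0.99):
    """One-step Q-learning targets computed with a frozen target network.

    `target_q` is an assumed callable mapping a state to a list of
    Q-values, one per action. In a real agent it would be a periodically
    synced copy of the online network.
    """
    targets = []
    for _state, _action, reward, next_state, done in batch:
        # No bootstrapping from terminal states
        bootstrap = 0.0 if done else gamma * max(target_q(next_state))
        targets.append(reward + bootstrap)
    return targets
```

In a training loop, the online network would be regressed toward these targets for the actions actually taken, while the weights behind `target_q` are only refreshed every fixed number of steps.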