Version 0.9.0 is a near-release version of the repo. It will benchmark model performance and demonstrate basic repo use. One of the last major missing pieces is the Interpretation object. The previous one was spaghetti code fit for a high-end Italian restaurant, so we need to simplify it for now. Most importantly, an interpretation object should be able to merge with another interpretation object so that their reward graphs can be compared. The goal is to do this in parallel with writing the accompanying Jupyter notebooks. We will also begin finalizing the README.
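A minimal sketch of what merging two interpretation objects could look like. The class name `Interpretation` matches the goal above, but the `merge` method and `rewards` field here are illustrative assumptions, not the repo's actual API:

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Interpretation:
    # Hypothetical shape: one reward series per model name.
    # The real Interpretation object may store richer data.
    rewards: Dict[str, List[float]] = field(default_factory=dict)

    def merge(self, other: "Interpretation") -> "Interpretation":
        # Combine both reward logs so the series can be drawn
        # on a single comparison graph. Existing names win on clash.
        merged = Interpretation(dict(self.rewards))
        for name, series in other.rewards.items():
            merged.rewards.setdefault(name, series)
        return merged


dqn = Interpretation({"DQN": [0.1, 0.4, 0.9]})
ddpg = Interpretation({"DDPG": [0.2, 0.5, 0.8]})
both = dqn.merge(ddpg)
# `both.rewards` now holds both series, ready to plot as one reward graph.
```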
Goals:
[x] Jupyter Notebooks
[x] DQN (ER, PER)
[x] Fixed Target DQN (ER, PER)
[x] Dueling DQN (ER, PER)
[x] Double DQN (ER, PER)
[x] DDDQN (ER, PER)
[x] DDPG (ER, PER)
[x] Misc
    [x] Add Reward Metric
[x] Interpretation
    [x] Reward Logging (Cleaner)
    [x] Reward (value) overlay, so we can train 2 models and compare their reward values.
    [x] Heat Mapping (Cleaner)
    [x] Q Value Estimation (Cleaner)
[x] README
[x] Exhaustive benchmark (5-run average) of 2-3 environments per model, hosted somewhere not tied to GitHub.
Edit 10/27/2019:
Added Interpreter and GroupInterpreter for easy reward logging.
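A rough usage sketch of grouped reward logging. The class names `Interpreter` and `GroupInterpreter` come from the note above, but the constructor signatures and the `append`/`mean_rewards` methods are assumptions for illustration, not the documented API:

```python
class Interpreter:
    """Hypothetical per-model reward log: a name plus its reward series."""

    def __init__(self, name, rewards):
        self.name = name
        self.rewards = rewards


class GroupInterpreter:
    """Hypothetical container that collects Interpreters for comparison."""

    def __init__(self):
        self.interpreters = []

    def append(self, interp):
        self.interpreters.append(interp)
        return self  # allow chaining

    def mean_rewards(self):
        # Average reward per model, for a quick cross-model comparison.
        return {i.name: sum(i.rewards) / len(i.rewards)
                for i in self.interpreters}


group = (GroupInterpreter()
         .append(Interpreter("dqn", [1.0, 2.0]))
         .append(Interpreter("ddpg", [2.0, 4.0])))
# group.mean_rewards() -> {"dqn": 1.5, "ddpg": 3.0}
```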
Edit 12/14/2019:
README changes will be made in the 1.0 pull request, since by then the code should hopefully be more consistent.
Edit 12/15/2019:
[x] Push this to PyPI so it is easier for people to download and install. Interested in testability.