Code for running reinforcement-learning (RL) experiments on continuing (non-episodic) problems.
This repository contains code for (1) different RL algorithms, (2) some environments, and (3) the agent-environment loop for running experiments with different parameters and over multiple runs.
agents/
: implementations of the RL algorithms
environments/
: implementations of the environments
config_files/
: JSON files containing all the parameters required to run a particular experiment
utils/
: various utilities and helper functions
experiments.py
: contains the agent-environment interaction loop (a minimal sketch follows this list)
main.py
: used to start an experiment based on the parameters specified in config_files
An example experiment can be run using:
python main.py --config-file='config_files/accesscontrol/test.json' --output-path='results/test_exp/'
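To give a sense of what such a config file might contain, here is a small illustration. The key names and values below are hypothetical, chosen only for illustration; consult the actual files in config_files/ for the real parameter names.

```python
import json

# Hypothetical example of a config file's contents; the keys below are
# illustrative assumptions, not the repository's actual schema.
example_config = {
    "environment": "AccessControl",
    "agent": "DifferentialQlearning",
    "num_runs": 30,         # independent runs with different random seeds
    "num_steps": 100000,    # steps per run (continuing problem, so no episodes)
    "step_size": 0.1,
    "exploration_epsilon": 0.1,
}

with open("example.json", "w") as f:
    json.dump(example_config, f, indent=4)
```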
Some basic plotting code is in plot_results_example.ipynb.
The prediction algorithms can be run with linear function approximation (using tile coding; see Section 9.5.4 of Sutton & Barto, 2018) and with tabular representations (via a one-hot encoding).
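As a quick illustration of the two kinds of representations, the sketch below shows a one-hot (tabular) feature vector and a very simple tile-coded feature vector for a one-dimensional state, both feeding the same linear value estimate. This is a simplified stand-in written only to convey the idea, not the tile-coding implementation used in the repository.

```python
import numpy as np

def one_hot_features(state_index, num_states):
    """Tabular representation as a special case of linear FA: one active feature per state."""
    x = np.zeros(num_states)
    x[state_index] = 1.0
    return x

def tile_coded_features(state, num_tilings=8, tiles_per_tiling=10, low=0.0, high=1.0):
    """Very simple 1-D tile coding: num_tilings offset tilings, one active tile in each."""
    x = np.zeros(num_tilings * tiles_per_tiling)
    scaled = (state - low) / (high - low) * tiles_per_tiling
    for tiling in range(num_tilings):
        offset = tiling / num_tilings      # each tiling's grid is shifted by a fraction of a tile
        tile = int(np.clip(scaled + offset, 0, tiles_per_tiling - 1))
        x[tiling * tiles_per_tiling + tile] = 1.0
    return x

# Linear value prediction takes the same form with either representation:
weights = np.zeros(8 * 10)
value_estimate = weights @ tile_coded_features(0.37)
```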
The control algorithms can be run with tabular, linear, and non-linear function approximation. The non-linear algorithms are essentially Mnih et al.'s (2015) DQN, along with Naik, Wan, Tomar, and Sutton's (2024) DQN with reward centering and the differential version of DQN.
There is a single algorithmic implementation that yields the different algorithms under different parameter choices. For example, for the control algorithms, there is one implementation of a discounted algorithm with reward centering; particular parameter settings then recover the individual algorithms.
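To illustrate the idea (this is not the repository's actual code), here is a rough sketch of a tabular Q-learning update with a discount factor and optional reward centering. The variable names and the exact form of the centering update are assumptions made for the example; the point is that different parameter choices recover familiar variants.

```python
import numpy as np

def q_learning_step(Q, avg_reward, s, a, r, s_next,
                    step_size=0.1, gamma=0.99, eta=1.0, center_rewards=True):
    """One update of a discounted Q-learning agent with (optional) reward centering.

    Illustrative sketch only. Different parameter choices recover different algorithms:
      - center_rewards=False, gamma < 1 : standard discounted Q-learning
      - center_rewards=True,  gamma < 1 : discounted Q-learning with reward centering
      - center_rewards=True,  gamma = 1 : (essentially) differential Q-learning
    """
    r_bar = avg_reward if center_rewards else 0.0
    td_error = r - r_bar + gamma * np.max(Q[s_next]) - Q[s, a]
    Q[s, a] += step_size * td_error
    if center_rewards:
        # TD-error-based estimate of the average reward (eta scales its step size)
        avg_reward += eta * step_size * td_error
    return Q, avg_reward
```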
The code in this repository can run most—if not all—experiments in the following works:
Note: Instead of maintaining multiple public repositories on GitHub for all the different projects in my PhD, I created this single repository that can probably run every experiment in my dissertation.
However, I have not re-run all those experiments with this unified codebase.
If you experience unexpected results, feel free to reach out to me at abhisheknaik22296@gmail.com and I will be happy to work those out with you :)