
continuing-rl-exps

Code for running reinforcement-learning (RL) experiments on continuing (non-episodic) problems.

This repository contains code for (1) different RL algorithms, (2) some environments, and (3) the agent–environment loop for running experiments with different parameter settings and multiple independent runs.
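
For continuing problems this loop never resets: the agent and environment interact in a single unbroken stream of experience. Below is a minimal sketch of such a loop, assuming a simple start/step interface; the method names and the `run_experiment` helper are illustrative assumptions, not this repository's actual API.

```python
# Minimal sketch of a continuing (non-episodic) agent-environment loop.
# The agent/env interface assumed here (start, step) is hypothetical and
# may not match the classes in this repository.

def run_experiment(agent, env, num_steps):
    """Run a single continuing experiment and return the observed rewards."""
    rewards = []

    observation = env.start()          # first observation; no episodes, no resets
    action = agent.start(observation)  # agent picks its first action

    for _ in range(num_steps):
        reward, observation = env.step(action)    # the stream never terminates
        action = agent.step(reward, observation)  # agent learns online, step by step
        rewards.append(reward)

    return rewards
```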

Code organization

An example experiment can be run using:

python main.py --config-file='config_files/accesscontrol/test.json' --output-path='results/test_exp/'
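
The --config-file and --output-path arguments come from the command above; everything else in the sketch below is an assumption. In particular, the configuration keys and values are hypothetical placeholders, since the actual schema is defined by the repository's own config files (e.g., config_files/accesscontrol/test.json).

```python
import json
import subprocess

# Hypothetical configuration; the keys and values below are placeholders,
# not the repository's actual configuration schema.
config = {
    "environment": "AccessControl",    # assumed key: which environment to run
    "agent": "DifferentialQLearning",  # assumed key: which algorithm to run
    "num_runs": 30,                    # assumed key: number of independent runs
    "num_steps": 100_000,              # assumed key: steps per run
}

with open("config_files/accesscontrol/my_exp.json", "w") as f:
    json.dump(config, f, indent=2)

# Launch the experiment with the same flags as the command above.
subprocess.run(
    [
        "python", "main.py",
        "--config-file=config_files/accesscontrol/my_exp.json",
        "--output-path=results/my_exp/",
    ],
    check=True,
)
```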

Some basic plotting code is in plot_results_example.ipynb.

Types of function approximation supported

The prediction algorithms can be run with linear function approximation using tile coding (see Sutton & Barto, 2018, Section 9.5.4) or with a tabular representation (via a one-hot encoding).
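
As a quick illustration of how the tabular case is a special case of linear function approximation, here is a small sketch of a one-hot encoding with a linear value estimate. It is illustrative only and does not reproduce the repository's feature-construction or tile-coding code.

```python
import numpy as np

def one_hot(state, num_states):
    """Tabular representation: a feature vector with a single active component."""
    features = np.zeros(num_states)
    features[state] = 1.0
    return features

# A linear value estimate is the dot product of a weight vector and features.
# With one-hot features it reduces to a table lookup: v_hat(s) == w[s].
num_states = 10
w = np.random.default_rng(0).normal(size=num_states)

s = 3
v_hat = np.dot(w, one_hot(s, num_states))
assert np.isclose(v_hat, w[s])
```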

The control algorithms can be run with tabular, linear, and non-linear function approximation. The non-linear algorithms are essentially DQN (Mnih et al., 2015), DQN with reward centering (Naik, Wan, Tomar, & Sutton, 2024), and the differential version of DQN.

One implementation for different algorithms

There is a single algorithmic implementation from which the different algorithms are obtained through different parameter choices. For example, for the control algorithms there is one implementation of a discounted algorithm with reward centering, and the other variants are recovered by specific settings of its parameters (see the sketch below).
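
The sketch below illustrates this idea for a tabular, discounted Q-learning update with reward centering. The parameter names (alpha, eta, gamma) and the particular form of the reward-rate update are assumptions made for illustration, not the repository's actual code: with eta = 0 the update reduces to standard discounted Q-learning, and with gamma = 1 it behaves like the differential (average-reward) variant.

```python
import numpy as np

def centered_q_learning_update(Q, r_bar, s, a, r, s_next,
                               alpha=0.1, eta=0.1, gamma=0.99):
    """One tabular Q-learning update with reward centering (illustrative sketch).

    Parameter choices select the variant:
      * eta = 0, gamma < 1: standard discounted Q-learning (r_bar stays fixed).
      * eta > 0, gamma < 1: discounted Q-learning with reward centering.
      * eta > 0, gamma = 1: the differential (average-reward) variant.
    """
    # TD error computed with the centered reward (r - r_bar).
    delta = (r - r_bar) + gamma * np.max(Q[s_next]) - Q[s, a]
    Q[s, a] += alpha * delta
    # One common choice: update the reward-rate estimate from the TD error.
    r_bar += eta * alpha * delta
    return Q, r_bar
```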


The code in this repository can run most—if not all—experiments in the following works:

Note: Instead of maintaining multiple public repositories on GitHub for all the different projects in my PhD, I created this single repository that can probably run every experiment in my dissertation. However, I have not re-run all of those experiments with this unified codebase. If you run into unexpected results, feel free to reach out to me at abhisheknaik22296@gmail.com and I will be happy to work those out with you :)