I am implementing a lot of reinforcement learning and imitation learning algorithms since I'm sick of reading about them but not really understanding them.
I need to make this code more modular and flexible, and take full advantage of Python's features. The policies right now are kind of hard-coded awkwardly. Look at modular_rl and see what I can change to make the policies better. We shouldn't have to make any special cases for continuous vs discrete actions spaces inside the actual VPG code, that should be handled by a policy class.
I need to make this code more modular and flexible, and take full advantage of Python's features. The policies right now are kind of hard-coded awkwardly. Look at
modular_rl
and see what I can change to make the policies better. We shouldn't have to make any special cases for continuous vs discrete actions spaces inside the actual VPG code, that should be handled by a policy class.