dqn, ddqn, drqn
reinforce, reinforce with baseline
ac, ac with target, a2c, a2c with target, a3c, sac
ppo, ippo, multidiscrete action ppo
dueling dqn
trpo
ddpg
custom env
Snake-0
Walker (BipedalWalker-v3 discrete version)
CartPole-v0
Pendulum-v1
BipedalWalker-v3
BipedalWalkerHardcore-v3