Open john9636 opened 3 years ago
DQN was provided as an example. Why was the policy gradient method not provided? Does it work on these problems?
Although we provide a set of benchmark solutions (agent_DQN.py, agent_MP.py), you are encouraged to use any other RL or non-RL methods.