Closed by hanbaoan123 3 years ago
Hi,
You can refer to RLzoo to see PG working for both discrete and continuous action spaces. The key point is to make the policy's action distribution a Gaussian in the continuous case, replacing the categorical distribution used in the discrete case, and to use its differentiable log-probability in the loss function. Additionally, if you simply want a continuous PG-based algorithm, you can also check or use more advanced ones such as PPO.
Best, Zihan
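For illustration, here is a minimal, framework-agnostic sketch of the idea described above: score each sampled action with the log-density of a diagonal Gaussian, then weight it by the discounted return. It is not taken from tutorial_PG.py or RLzoo; the function names (`gaussian_log_prob`, `sample_action`, `discounted_returns`, `pg_loss`) are hypothetical, and it assumes the policy outputs a mean and a log standard deviation per action dimension. In a real implementation these quantities would be tensors produced by the network, and the same formulas would be written with the framework's differentiable ops.

```python
import math
import random


def gaussian_log_prob(action, mean, log_std):
    """Log-density of a diagonal Gaussian, summed over action dimensions."""
    total = 0.0
    for a, mu, ls in zip(action, mean, log_std):
        z = (a - mu) / math.exp(ls)  # standardized distance from the mean
        total += -0.5 * z * z - ls - 0.5 * math.log(2.0 * math.pi)
    return total


def sample_action(mean, log_std, rng=random):
    """Draw one continuous action; this replaces categorical sampling."""
    return [rng.gauss(mu, math.exp(ls)) for mu, ls in zip(mean, log_std)]


def discounted_returns(rewards, gamma=0.99):
    """Reward-to-go G_t = r_t + gamma * G_{t+1} for one episode."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def pg_loss(log_probs, returns):
    """Vanilla PG surrogate loss: -mean(log_prob * return)."""
    weighted = [lp * g for lp, g in zip(log_probs, returns)]
    return -sum(weighted) / len(weighted)


if __name__ == "__main__":
    # Assumed policy outputs for a 2-dimensional action space.
    mean, log_std = [0.0, 1.0], [0.0, -0.5]
    action = sample_action(mean, log_std)
    print(action, gaussian_log_prob(action, mean, log_std))
```

Compared with the discrete case, only the distribution changes: the categorical log-probability of the chosen action is replaced by the Gaussian log-density of the sampled action, and the loss keeps the same form.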
Despite the suggestions in tutorial_PG.py, I still don't know how to modify the code so that it applies to continuous action problems. Could you please add an example to illustrate this?