Closed by grandpahao 1 year ago
Hi @grandpahao,
I used the original safety-starter-agents as the on-policy baselines in the paper; see the third paragraph here: https://github.com/liuzuxin/cvpo-safe-rl#notes-and-acknowledgments. I also recently found another repo that contains implementations of PPO-L, TRPO-L, and CPO: https://github.com/PKU-MARL/Safe-Policy-Optimization. Thanks.
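For a rough idea of what PPO with a Lagrangian multiplier involves, here is a minimal sketch of one common formulation. It is not code taken from either repo above: the function names, the learning rate, and the `1 + lam` rescaling are illustrative assumptions, and a real implementation would operate on batched tensors instead of single floats.

```python
def update_multiplier(lam, mean_ep_cost, cost_limit, lr=0.05):
    """Dual ascent step on the multiplier (hypothetical helper).

    lam grows when the measured episode cost exceeds the limit,
    shrinks otherwise, and is projected back to lam >= 0.
    """
    return max(0.0, lam + lr * (mean_ep_cost - cost_limit))


def penalized_advantage(adv_reward, adv_cost, lam):
    """Combine reward and cost advantages into one signal.

    Dividing by (1 + lam) is an assumed rescaling that keeps the
    magnitude bounded as lam grows.
    """
    return (adv_reward - lam * adv_cost) / (1.0 + lam)


def ppo_lagrangian_surrogate(ratio, adv_reward, adv_cost, lam, clip=0.2):
    """Clipped PPO surrogate for a single sample, using the penalized advantage."""
    adv = penalized_advantage(adv_reward, adv_cost, lam)
    clipped_ratio = min(max(ratio, 1.0 - clip), 1.0 + clip)
    # Standard PPO pessimistic bound: take the smaller of the two terms.
    return min(ratio * adv, clipped_ratio * adv)
```

In a training loop, the policy would be updated by maximizing the mean of `ppo_lagrangian_surrogate` over a batch, and `update_multiplier` would be called once per epoch with the average episode cost.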
Thanks a lot for your answer~
Hi~ I could not find code for on-policy algorithms in this repo. Do you have any examples of PPO with Lagrangian multipliers?