PKU-Alignment / omnisafe

OmniSafe is an infrastructural framework for accelerating SafeRL research.
https://www.omnisafe.ai
Apache License 2.0

about P3O algorithms #320

Eureka725 closed this issue 4 months ago

Eureka725 commented 5 months ago

Questions

While studying the P3O algorithm, I couldn't find any update step for kappa. Is kappa updated anywhere in /omnisafe/omnisafe/algorithms/on_policy/penalty_function/p3o.py?

Gaiejj commented 5 months ago

In the method description of the P3O paper (Penalized Proximal Policy Optimization for Safe Reinforcement Learning), in the lower-left corner of page 3747, the authors describe two ways to implement P3O and consider both to be effective in practice.

OmniSafe has adopted the second implementation method.
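
To illustrate why no update step would appear in that case, here is a minimal sketch. It assumes the second method means treating kappa as a fixed hyperparameter rather than updating it during training; that is an assumption, though it is consistent with the absence of an update step noted in the question. The function name, argument names, and default values are hypothetical and are not taken from OmniSafe's p3o.py.

```python
def p3o_penalized_loss(ratios, reward_advs, cost_advs, cost_violation,
                       kappa=20.0, clip=0.2):
    """Hypothetical P3O-style loss with a FIXED penalty factor kappa.

    ratios:         importance ratios pi_new(a|s) / pi_old(a|s), one per sample
    reward_advs:    reward advantage estimates, one per sample
    cost_advs:      cost advantage estimates, one per sample
    cost_violation: estimated episode cost minus the cost limit (assumed form)
    kappa:          penalty factor, kept constant (assumption: never updated)
    clip:           PPO clipping range
    """
    n = len(ratios)

    def clipped(r):
        # Restrict the ratio to [1 - clip, 1 + clip], as in PPO.
        return min(max(r, 1.0 - clip), 1.0 + clip)

    # Reward part: pessimistic (min) clipped surrogate, negated to form a loss.
    reward_surr = sum(min(r * a, clipped(r) * a)
                      for r, a in zip(ratios, reward_advs)) / n
    loss_reward = -reward_surr

    # Cost part: pessimistic (max) clipped surrogate plus the current violation.
    cost_surr = sum(max(r * a, clipped(r) * a)
                    for r, a in zip(ratios, cost_advs)) / n

    # ReLU penalty: zero while the constraint is estimated to be satisfied,
    # otherwise scaled by the fixed kappa.
    loss_cost = kappa * max(0.0, cost_surr + cost_violation)

    return loss_reward + loss_cost
```

In this sketch kappa only scales the penalty term and is passed in as a constant, so no separate update rule for it is needed.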