PKU-Alignment / omnisafe

OmniSafe is an infrastructural framework for accelerating SafeRL research.
https://www.omnisafe.ai
Apache License 2.0
901 stars, 129 forks

[Question] Where is CVPO? #311

Closed JiwenJ closed 1 month ago

JiwenJ commented 5 months ago

Questions

Hello, I believe there was a CVPO implementation in issue #57. I'm curious as to why it was removed.

Gaiejj commented 5 months ago

As previously discussed, we found that CVPO requires algorithm-specific adjustments to the environment's configuration (i.e., removing the randomness of the environment layer). We ran detailed experiments and found that CVPO performed unsatisfactorily when that randomness was left in place.
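To illustrate what "removing the randomness" of an environment can mean in practice, here is a minimal, standard-library-only sketch. It is not OmniSafe or Safety-Gymnasium code: `ToyLayoutEnv` and `FixedLayoutWrapper` are hypothetical names invented for this example, and the assumption is simply that the randomness comes from a layout sampled on every `reset`. The wrapper pins the seed so each episode starts from the same layout.

```python
import random


class ToyLayoutEnv:
    """Hypothetical stand-in for an environment whose layout is sampled on reset."""

    def __init__(self, num_hazards=3):
        self.num_hazards = num_hazards
        self._rng = random.Random()

    def reset(self, seed=None):
        # Re-seed only when a seed is supplied (a common Gym-style convention).
        if seed is not None:
            self._rng = random.Random(seed)
        # Sample hazard positions in the unit square; this is the "random layout".
        return [(self._rng.random(), self._rng.random())
                for _ in range(self.num_hazards)]


class FixedLayoutWrapper:
    """Hypothetical wrapper that reuses one seed so the layout never changes."""

    def __init__(self, env, seed=0):
        self.env = env
        self.seed = seed

    def reset(self, seed=None):
        # Ignore the caller's seed and always pass the fixed one.
        return self.env.reset(seed=self.seed)


fixed_env = FixedLayoutWrapper(ToyLayoutEnv(), seed=0)
first_layout = fixed_env.reset()
second_layout = fixed_env.reset()
print(first_layout == second_layout)
```

With the wrapper in place, every episode sees an identical layout, which is the kind of deterministic setting the comment above describes. Whether the actual adjustment was made this way is not stated in this thread.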

Related experimental evidence has also been reported in the SafeRL literature. See the ICLR 2024 paper "Off-Policy Primal-Dual Safe Reinforcement Learning".

In the meantime, we are working on adding the environment customizations this algorithm requires to Safety-Gymnasium, to make it easier for the community to use.

If you have any comments or ideas, feel free to discuss them further.

Gaiejj commented 1 month ago

Since there has been no response for a long time, we will close this issue. Please feel free to reopen it if you encounter any new problems!