Closed JiwenJ closed 1 month ago
As previously discussed, we found that CVPO requires adjustments to the environment's configuration (specifically, removing randomness from the environment layout). We ran detailed experiments and found that CVPO's performance on randomized layouts was unsatisfactory.
Relevant experimental evidence has also been reported in the SafeRL literature; see the ICLR 2024 paper Off-Policy Primal-Dual Safe Reinforcement Learning.
In the meantime, we are working on contributing the environment customizations this algorithm requires as modifications to Safety-Gymnasium, to make them easier for the community to use.
If you have any comments or ideas, feel free to discuss them further.
Since there has been no response for a long time, we will close this issue. Please feel free to reopen it if you encounter any new problems!
Required prerequisites
Questions
Hello, I believe there was a CVPO implementation mentioned in issue #57. I'm curious why it was removed.