PKU-Alignment / omnisafe

JMLR: OmniSafe is an infrastructural framework for accelerating SafeRL research.
https://www.omnisafe.ai
Apache License 2.0
936 stars 132 forks source link

[Question] Cost Loss and Update for SAC Lagrangian #87

Closed konwook closed 1 year ago

konwook commented 1 year ago

Required prerequisites

Questions

  1. Why does the backup for the cost critic loss assign data['rew'] instead of data['cost'] to cost? Wouldn't this update result in a cost critic identical to the standard value critic?

  2. The initial update for the Lagrange multiplier uses Jc = data['cost'].sum().item(). However, the update_lagrange_multiplier method uses Jc to compute the lambda loss which has function signature: def compute_lambda_loss(self, mean_ep_cost): Shouldn't Jc be defined as Jc = data['cost'].mean().item() if it's the mean_ep_cost?

Gaiejj commented 1 year ago

You are very attentive and thank you very much for your suggestions. We have fixed this issue on the dev branch. If you find a new issue in subsequent use, we will fix it as soon as possible.