google-deepmind / acme

A library of reinforcement learning components and agents
Apache License 2.0
3.52k stars, 426 forks

Log rewards statistics in SAC agents. #232

Open wookayin opened 2 years ago

wookayin commented 2 years ago

PPO agents log the reward_mean and rewards_std metrics, but SAC agents do not.

Note that the SAC implementation is not flexible enough for custom metrics to be configured or extended (because update_step is not a method), so it seems reasonable to add them directly in the update_step function.
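To illustrate the point above, here is a minimal sketch in plain Python. It is not Acme's code: `ToyLearner`, `reward_stats`, and the placeholder `critic_loss` entry are hypothetical names, plain lists stand in for the JAX arrays, and it assumes update_step is a function defined inside the learner's constructor. Only the metric names are taken from the description.

```python
import statistics
from typing import Dict, Sequence


def reward_stats(rewards: Sequence[float]) -> Dict[str, float]:
    """Summary statistics for the rewards in one sampled batch."""
    return {
        # Metric names as quoted for the PPO agents above.
        "reward_mean": statistics.fmean(rewards),
        # Population standard deviation (no Bessel correction).
        "rewards_std": statistics.pstdev(rewards),
    }


class ToyLearner:
    """Hypothetical learner used only for illustration."""

    def __init__(self) -> None:
        def update_step(rewards: Sequence[float]) -> Dict[str, float]:
            # Placeholder for the real losses; only the logging is shown.
            metrics = {"critic_loss": 0.0}
            # update_step is a local function, so a subclass cannot
            # override it. The extra metrics are added right here.
            metrics.update(reward_stats(rewards))
            return metrics

        self._update_step = update_step

    def step(self, rewards: Sequence[float]) -> Dict[str, float]:
        return self._update_step(rewards)
```

Calling `ToyLearner().step(batch_rewards)` returns the usual metrics plus the two reward statistics, which a logger can then write out alongside the losses.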

wookayin commented 2 years ago

I'm submitting this as a PR separate from #231 to ease the review process, but it'd be great if each of the PRs could be rebased and merged without creating a merge commit. Thanks!

wookayin commented 2 years ago

@qstanczyk, thank you for the update on the PR after such a long time! I understand DM may not have enough resources available, but as a general request, it'd be greatly appreciated if the turnaround time for community contributions could be reduced.