How to access parameter values for each time-step with callback function policy_param?

google / brax

Massively parallel rigidbody physics simulation on accelerator hardware.

Apache License 2.0


How to access parameter values for each time-step with callback function policy_param? #463

Status: Closed (blancamiller closed 3 months ago)

blancamiller commented 4 months ago

I am using the provided Brax Training tutorial for PPO + Ant. The notebook provides a user-defined callback function, progress. Similarly, you can specify a policy_param function and pass it into train. This is where my issue lies: I'm able to get the parameter values for each episode, but I'd like to access the parameters for each time-step. How can I go about doing that?

btaba commented 4 months ago

Hi @blancamiller, thanks for using brax! Could you please describe your use-case in a bit more detail? What do you mean by parameter values for each time-step? Parameters don't change during a rollout; they get updated over a batch of episodes.
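
To illustrate the point about rollouts, here is a minimal sketch in plain Python. It is a toy loop, not Brax's training code: `rollout`, `train`, the scalar `params`, and the callback signature are all hypothetical stand-ins chosen for illustration. It shows why a parameter callback only has something new to report once per update rather than once per time-step.

```python
# Toy training loop (NOT Brax's API): every name here is hypothetical.
from typing import Callable, List, Tuple


def rollout(params: float, num_steps: int) -> List[float]:
    """Run one episode and return the params the policy used at each time-step."""
    # The policy reads the same params at every step of the rollout.
    return [params for _ in range(num_steps)]


def train(
    num_updates: int,
    num_steps: int,
    policy_params_fn: Callable[[int, float], None],
) -> None:
    params = 0.0
    for update in range(num_updates):
        seen = rollout(params, num_steps)
        # Within a rollout, every time-step saw identical params.
        assert all(p == params for p in seen)
        # Stand-in for a gradient update computed from the collected batch.
        params = params + 1.0
        # The callback fires once per update, after the params change.
        policy_params_fn(update, params)


history: List[Tuple[int, float]] = []
train(
    num_updates=3,
    num_steps=5,
    policy_params_fn=lambda i, p: history.append((i, p)),
)
print(history)  # [(0, 1.0), (1, 2.0), (2, 3.0)]
```

Under these assumptions, logging the parameters at every time-step would just repeat the same values until the next update, so recording them once per update captures everything that changes.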