Closed: mazzamani closed this 2 weeks ago
Hi, policy_module.apply expects the params to be an iterable, not PPONetworkParams. You can unpack params via (params.policy, params.value).
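To illustrate the point above, here is a minimal sketch in plain Python. It is not Brax's actual code: the PPONetworkParams class below is a hypothetical stand-in dataclass with policy and value fields, and apply is a toy function that only mimics an apply call that unpacks its params argument. It shows why passing the container directly fails while passing an explicit tuple works.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class PPONetworkParams:
    """Hypothetical stand-in for the real container (assumption, not Brax code)."""
    policy: Any
    value: Any


def apply(params, obs):
    """Toy apply function that unpacks params like a tuple (assumption)."""
    first, second = params  # raises TypeError if params is not iterable
    return first * obs + second


params = PPONetworkParams(policy=2.0, value=1.0)

# Passing the dataclass directly fails, because a plain dataclass is not iterable.
try:
    apply(params, 3.0)
    failed = False
except TypeError:
    failed = True

# Unpacking into an explicit tuple gives the function something it can iterate.
result = apply((params.policy, params.value), 3.0)
```

Which fields belong in the tuple depends on what the receiving function expects, so check that against the function you are actually calling.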
Thanks for the answer. So, this worked for me:
inference_fn = self.make_inference_fn((self.params[0], self.params[1].policy))
My goal is to save intermediate policies. The policy at the end of training works fine for me:
However, when I call self.visualize_trajectory() from self.policy_params_callback, I run into the following error:
This is my visualization method which is adapted from the example training code:
Any idea why it is not working?
Edit: here is the whole script: