Closed thiagopbueno closed 5 years ago
LGTM.
The failing tests are related to the change to apply_gradients_fn,
for which I added an option in another PR. Interestingly, in test_checkpoint
the action output is NaN. What might cause this?
I just added grad_stats_fn to mapo_policy.py.