Closed: axelbr closed this issue 3 years ago.
Hi, thanks for using garage and pointing out this issue.
The `get_param_values` and `set_param_values` methods are used to update the policy during training. It is true that the tf/policy interface does not require them; the policies in the tf branch still work because they ultimately also inherit from `Module`, which provides these methods.
I think a quick way to fix this is to implement the two methods yourself: in `get_param_values`, return all the parameters that define your policy, and in `set_param_values`, restore those parameters.
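To illustrate the suggested fix, here is a minimal, framework-agnostic sketch of that contract. The class name follows the thread; the plain-Python weight layout and the snapshot/restore flow are assumptions for this sketch, not garage's actual implementation (in a real TF2 or PyTorch policy the parameters would be variables or tensors):

```python
import copy


class CustomGaussianMlpPolicy:
    """Toy stand-in for a policy whose weights live in plain Python containers."""

    def __init__(self):
        # Stand-in for real network weights (e.g. tf.Variables or torch tensors).
        self._weights = {"hidden": [0.1, 0.2], "output": [0.3]}

    def get_param_values(self):
        # Return everything that defines the policy, as a detached copy.
        return copy.deepcopy(self._weights)

    def set_param_values(self, params):
        # Restore the policy from a previously saved snapshot.
        self._weights = copy.deepcopy(params)


policy = CustomGaussianMlpPolicy()
snapshot = policy.get_param_values()   # save current parameters
policy._weights["output"][0] = 99.0    # simulate a training update
policy.set_param_values(snapshot)      # roll back to the snapshot
print(policy._weights["output"][0])    # → 0.3
```

The important property is that `set_param_values(get_param_values())` is a round trip: whatever `get_param_values` returns must be enough to fully reconstruct the policy.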
May I ask which framework you use in your custom policy, TensorFlow or PyTorch?
Thank you for the advice, I will try it out.
I am trying to get TF2 running with garage, so I have to use custom policies that don't rely on sessions. As far as I understand, garage isn't compatible with TF2, right?
The current primitives (policies, baselines/value_functions, q_functions) in garage/tf aren't compatible with TF2, but the components outside garage/tf are, for example the trainer, sampler, and replay_buffer.
@ryanjulian @krzentner What do you think?
Garage currently does work "using TF2" (with the exception of RL2-TRPO, which causes the TF2 backend to leak memory for unknown reasons), but it uses the `tf.compat.v1` namespace for everything. Unfortunately, I don't think it's really possible to implement the `garage.tf.models.Model.build` method using TF2 primitives. I could be wrong though; I'm not too familiar with TF2. It would be great to have TF2 support, but it hasn't been a priority, since it looks like we would have to modify `garage.tf.algos` to be aware of eager execution, and PyTorch has been relatively more popular.
It should be broadly possible to use the TF2 Keras API to implement `garage.tf.models.Model.build`, and if we did a TF2 port we would likely eliminate `garage.tf.models.Model` in favor of Keras.
As maintainers, we do not oppose updating garage to use TF2 APIs, but we haven't prioritized it in our limited time.
The reasons we haven't prioritized TF2 support, in addition to what @krzentner mentions, are:
Thank you for your elaborate answer!
I also believe that most of the benefit comes from the infrastructure garage provides; the samplers and training loops are really helpful. I can see now why TF2 support isn't your priority.
I think we can close this issue now. Thank you for the extensive responses!
When implementing a custom policy, I got the following error (trainer.py):
`AttributeError: 'CustomGaussianMlpPolicy' object has no attribute 'get_param_values'`
However, the policy interface does not require this method to be implemented.
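For reference, this failure mode can be reproduced without garage: the trainer calls `get_param_values()` on the policy when snapshotting, and any object lacking that method raises the same `AttributeError`. A minimal sketch, where `save_policy_params` is a hypothetical stand-in for the trainer's call, not garage's actual code:

```python
class CustomGaussianMlpPolicy:
    """A policy implementing the action interface but not get_param_values."""

    def get_action(self, observation):
        return observation  # placeholder behaviour


def save_policy_params(policy):
    # Stand-in for the call garage's trainer makes when saving a snapshot.
    return policy.get_param_values()


policy = CustomGaussianMlpPolicy()
try:
    save_policy_params(policy)
except AttributeError as err:
    print(err)  # → 'CustomGaussianMlpPolicy' object has no attribute 'get_param_values'
```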