Open davidADSP opened 1 year ago
I can successfully run the script on the current master. Could you retry it, or share your Ray and TF versions?
Ray 2.3.1, Tensorflow 2.12.0
OK, it looks like it's wrong in the docs, but not in the example code.
Specifically, this page https://docs.ray.io/en/latest/rllib/rllib-concepts.html
Thanks for raising this. We are deprecating the API in question and our TF1 support altogether. I'm sorry this could not be resolved. If possible, please use another framework. We can close this issue as soon as RLModules and Trainers have trickled through the docs.
What happened + What you expected to happen
Running the script in https://github.com/ray-project/ray/blob/master/rllib/examples/custom_tf_policy.py results in an error: `use_critic=True` is the default, but the policy has no critic, and `from_batch` is also deprecated.

The fix is to pass `use_critic=False`, like this:

`return compute_advantages(sample_batch, 0.0, policy.config["gamma"], use_gae=False, use_critic=False)`

and also to remove the `from_batch` call, calling the model directly instead:

`logits, _ = model(train_batch)`
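For context, here is a minimal, self-contained sketch (not RLlib's actual implementation) of what `compute_advantages(..., use_gae=False, use_critic=False)` reduces to: with GAE and the critic both disabled, the advantages are just the discounted returns over the trajectory, so no value-function output is needed from the model — which is why passing `use_critic=False` avoids the "critic not found" error for a plain policy-gradient policy.

```python
import numpy as np

def discounted_returns(rewards, gamma, last_r=0.0):
    # With use_gae=False and use_critic=False, the advantage of each
    # timestep is the discounted sum of future rewards (bootstrapped
    # from last_r), with no value-function baseline subtracted.
    out = np.zeros(len(rewards), dtype=np.float32)
    running = last_r
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        out[t] = running
    return out

print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))  # [1.75 1.5  1.  ]
```

The `0.0` passed as the second argument to `compute_advantages` in the fix above plays the role of `last_r` here: the bootstrap value at the end of the sampled batch.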
Versions / Dependencies
Ray 2.3.1
Reproduction script
Running this script:
https://github.com/ray-project/ray/blob/master/rllib/examples/custom_tf_policy.py
Issue Severity
Low: It annoys or frustrates me.