Open rubenjacob opened 5 months ago
Hi @rubenjacob, I faced the same issue with PPO. So far I have worked around it by downgrading TensorFlow from 2.16 to 2.15. Something in the compatibility with Ray probably broke with the latest TensorFlow update.
I ran your reproduction script and it finished without errors. Moreover, printing policy.cur_lr gave "<tf.Variable 'lr:0' shape=() dtype=float32, numpy=0.0005>".
All my tests were done with Python 3.10, but I think it should work with 3.11.9 as well.
Hi @LorenzoMattia, thanks for your reply. I know that TF <= 2.15 works. I was trying to update my code to TensorFlow 2.16 and Keras 3, but I guess that isn't fully supported yet.
I created a pull request to fix the issue.
What happened + What you expected to happen
Initializing ImpalaTF2Policy currently throws a ValueError since self.cur_lr is a tf.Variable, but the optimizer class only takes floats, LearningRateSchedules or callables.
Versions / Dependencies
Ray == 2.10.0
Python == 3.11.9
OS == Win10
Tensorflow == 2.16.1
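For readers who want to see why this combination of versions fails, here is a minimal plain-Python sketch of the kind of type check described in this issue. It does not import TensorFlow or Ray: FakeVariable, FakeSchedule and check_learning_rate are hypothetical stand-ins invented for illustration, and the real optimizer code may differ in detail. The last part shows one possible workaround, wrapping the variable in a zero-argument callable. That is an assumption about what could work, not necessarily what the pull request does.

```python
class FakeSchedule:
    """Hypothetical stand-in for a LearningRateSchedule (maps a step to a rate)."""

    def __init__(self, value: float) -> None:
        self.value = value

    def __call__(self, step: int) -> float:
        return self.value


class FakeVariable:
    """Hypothetical stand-in for tf.Variable: holds a float, but is
    neither a float nor callable itself."""

    def __init__(self, value: float) -> None:
        self.value = value

    def assign(self, value: float) -> None:
        self.value = value


def check_learning_rate(lr):
    """Mimic the check from the issue: accept floats, schedules or callables."""
    if isinstance(lr, (float, FakeSchedule)) or callable(lr):
        return lr
    raise ValueError(
        "learning_rate must be a float, a schedule or a callable; "
        f"got {type(lr).__name__}"
    )


cur_lr = FakeVariable(0.0005)

# Passing the variable object directly is rejected, as reported above.
try:
    check_learning_rate(cur_lr)
    rejected = False
except ValueError:
    rejected = True

# Possible workaround (assumption): hand over a callable that reads the
# variable, so later updates to cur_lr are still picked up.
lr_fn = check_learning_rate(lambda: cur_lr.value)
cur_lr.assign(0.0001)
current = lr_fn()
```

The callable keeps the mutable learning-rate variable that the policy updates during training, while giving the optimizer an argument type it accepts.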
Reproduction script
Issue Severity
High: It blocks me from completing my task.