ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[RLlib] DQN and framework=tf / issue with latest tensorflow 2.16.1 #44676

Open karstenddwx opened 5 months ago

karstenddwx commented 5 months ago

What happened + What you expected to happen

Simple RLlib examples no longer run with the latest TensorFlow 2.16.x.

ray::DQN.__init__() (pid=2269076, ip=10.210.234.45, actor_id=03b7a7a9a2bf975ba84dc4f301000000, repr=DQN)
  File "/home/user/venv_new/lib64/python3.9/site-packages/ray/rllib/utils/deprecation.py", line 109, in patched_init
    return obj_init(*args, **kwargs)
  File "/home/user/venv_new/lib64/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 533, in __init__
    super().__init__(
  File "/home/user/venv_new/lib64/python3.9/site-packages/ray/tune/trainable/trainable.py", line 161, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/home/user/venv_new/lib64/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 631, in setup
    self.workers = WorkerSet(
  File "/home/user/venv_new/lib64/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 181, in __init__
    raise e.args[0].args[2]
ValueError: A KerasTensor cannot be used as input to a TensorFlow function. A KerasTensor is a symbolic placeholder for a shape and dtype, used when constructing Keras Functional models or Keras Functions. You can only use it as input to a Keras layer or a Keras operation (from the namespaces keras.layers and keras.operations). You are likely doing something like:

x = Input(...)
...
tf_fn(x)  # Invalid.

What you should do instead is wrap tf_fn in a layer:

class MyLayer(Layer):
    def call(self, x):
        return tf_fn(x)

x = MyLayer()(x)

Also see https://discuss.tensorflow.org/t/converting-keras-tensors-to-tensorflow-tensors/23309

Versions / Dependencies

ray 2.10.0
tensorflow 2.16.x
python 3.9

Reproduction script

python rllib/examples/custom_env.py --run DQN --framework tf

Issue Severity

Medium: It is a significant difficulty but I can work around it.

RocketRider commented 3 months ago

Probably duplicate: https://github.com/ray-project/ray/issues/45821

lo-zed commented 2 months ago

I have the same problem. I tried switching Keras to legacy mode as suggested, using the environment variable TF_KERAS_LEGACY_MODE=1, but that didn't help.

RocketRider commented 2 months ago

This should work, but the variable needs to be set before Ray is started and before Keras is imported in Python.

lo-zed commented 2 months ago

I tried both: setting it via os.environ on the first line of the script, and setting the variable in bash. Neither worked.
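
The ordering requirement discussed above (set the variable before TensorFlow or Keras is imported) can be sketched as follows. This is an illustrative sketch, not a confirmed fix for this issue. It assumes the variable name TF_USE_LEGACY_KERAS, which is the name given in the TensorFlow 2.16 release notes (alongside installing the tf-keras package) and differs from the name quoted in this thread. The helper enable_legacy_keras is hypothetical and is not part of Ray or TensorFlow.

```python
import os
import sys

# Assumption: TF_USE_LEGACY_KERAS is the variable named in the TensorFlow 2.16
# release notes; the thread above quotes a different name.
LEGACY_KERAS_VAR = "TF_USE_LEGACY_KERAS"


def enable_legacy_keras(environ=None, loaded_modules=None):
    """Set the legacy-Keras flag, refusing if TensorFlow/Keras is already imported.

    Hypothetical helper for illustration only. Returns a dict of environment
    variables that could also be forwarded to worker processes.
    """
    environ = os.environ if environ is None else environ
    loaded_modules = sys.modules if loaded_modules is None else loaded_modules

    # The flag is read at import time, so setting it afterwards has no effect.
    already_loaded = [m for m in ("tensorflow", "keras") if m in loaded_modules]
    if already_loaded:
        raise RuntimeError(
            f"{LEGACY_KERAS_VAR} must be set before importing: {already_loaded}"
        )

    environ[LEGACY_KERAS_VAR] = "1"
    return {LEGACY_KERAS_VAR: "1"}


if __name__ == "__main__":
    env_vars = enable_legacy_keras()
    # Only after this point import the heavy libraries, for example:
    #   import ray
    #   ray.init(runtime_env={"env_vars": env_vars})  # forward to worker processes
    print(env_vars)
```

Because Ray runs rollout workers in separate processes, a variable set only in the driver script may not reach them. Passing it through runtime_env, as in the commented lines, is one way to forward it. Whether this resolves the KerasTensor error in RLlib is not established in this thread.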