ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[RLlib] Getting Started first example executed with TF2 is followed by an error #45821

Open Deonixlive opened 1 month ago

Deonixlive commented 1 month ago

What happened + What you expected to happen

I tried the Getting Started commands at https://docs.ray.io/en/latest/rllib/rllib-training.html, installing with pip install tensorflow[and-cuda] followed by pip install "ray[rllib]".

Then I tried the example:

rllib train --algo DQN --env CartPole-v1 --framework tf2 --stop '{"training_iteration": 30}'

This raises a ValueError instead of saving a checkpoint with a trained model.

(DQN pid=210953) Exception raised in creation task: The actor died because of an error raised in its creation task, ray::DQN.__init__() (pid=210953, ip=192.168.1.207, actor_id=94093f51e79110d273a302e501000000, repr=DQN)
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/algorithms/algorithm.py", line 554, in __init__
(DQN pid=210953)     super().__init__(
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/tune/trainable/trainable.py", line 158, in __init__
(DQN pid=210953)     self.setup(copy.deepcopy(self.config))
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/algorithms/algorithm.py", line 640, in setup
(DQN pid=210953)     self.workers = EnvRunnerGroup(
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/env/env_runner_group.py", line 169, in __init__
(DQN pid=210953)     self._setup(
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/env/env_runner_group.py", line 260, in _setup
(DQN pid=210953)     self._local_worker = self._make_worker(
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/env/env_runner_group.py", line 1108, in _make_worker
(DQN pid=210953)     worker = cls(
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/evaluation/rollout_worker.py", line 532, in __init__
(DQN pid=210953)     self._update_policy_map(policy_dict=self.policy_dict)
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1737, in _update_policy_map
(DQN pid=210953)     self._build_policy_map(
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1848, in _build_policy_map
(DQN pid=210953)     new_policy = create_policy_for_framework(
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/utils/policy.py", line 138, in create_policy_for_framework
(DQN pid=210953)     return policy_class(observation_space, action_space, merged_config)
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/policy/eager_tf_policy.py", line 167, in __init__
(DQN pid=210953)     super(TracedEagerPolicy, self).__init__(*args, **kwargs)
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/policy/eager_tf_policy.py", line 429, in __init__
(DQN pid=210953)     self.model = make_model(self, observation_space, action_space, config)
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/algorithms/dqn/dqn_tf_policy.py", line 181, in build_q_model
(DQN pid=210953)     q_model = ModelCatalog.get_model_v2(
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/models/catalog.py", line 799, in get_model_v2
(DQN pid=210953)     return wrapper(
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/algorithms/dqn/distributional_q_tf_model.py", line 165, in __init__
(DQN pid=210953)     q_out = build_action_value(name + "/action_value/", self.model_out)
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/algorithms/dqn/distributional_q_tf_model.py", line 135, in build_action_value
(DQN pid=210953)     logits = tf.expand_dims(tf.ones_like(action_scores), -1)
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/tensorflow/python/ops/weak_tensor_ops.py", line 88, in wrapper
(DQN pid=210953)     return op(*args, **kwargs)
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
(DQN pid=210953)     raise e.with_traceback(filtered_tb) from None
(DQN pid=210953)   File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/keras/src/backend/common/keras_tensor.py", line 91, in __tf_tensor__
(DQN pid=210953)     raise ValueError(
(DQN pid=210953) ValueError: A KerasTensor cannot be used as input to a TensorFlow function. A KerasTensor is a symbolic placeholder for a shape and dtype, used when constructing Keras Functional models or Keras Functions. You can only use it as input to a Keras layer or a Keras operation (from the namespaces keras.layers and keras.operations). You are likely doing something like:
(DQN pid=210953)
(DQN pid=210953)     x = Input(...)
(DQN pid=210953)     ...
(DQN pid=210953)     tf_fn(x)  # Invalid.
(DQN pid=210953)
(DQN pid=210953) What you should do instead is wrap tf_fn in a layer:
(DQN pid=210953)
(DQN pid=210953)     class MyLayer(Layer):
(DQN pid=210953)         def call(self, x):
(DQN pid=210953)             return tf_fn(x)
(DQN pid=210953)
(DQN pid=210953)     x = MyLayer()(x)

Versions / Dependencies

Reproduction script

pip install tensorflow[and-cuda]
pip install "ray[rllib]"

rllib train --algo DQN --env CartPole-v1 --framework tf2 --stop '{"training_iteration": 30}'
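The same failure should also be reproducible without the CLI. A rough Python-API equivalent of the command above (a sketch only; the CLI may apply additional defaults):

import ray
from ray.rllib.algorithms.dqn import DQNConfig

ray.init()

# Mirror the CLI flags: --algo DQN --env CartPole-v1 --framework tf2
config = DQNConfig().environment("CartPole-v1").framework("tf2")
algo = config.build()  # should fail here with the KerasTensor ValueError

for _ in range(30):  # --stop '{"training_iteration": 30}'
    algo.train()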

Issue Severity

Medium: It is a significant difficulty but I can work around it.

RocketRider commented 3 weeks ago

This can probably be fixed easily; it looks similar to https://github.com/ray-project/ray/pull/45562. Currently it only works if we set Keras 3 to legacy mode, as sketched below.
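A minimal sketch of that workaround, assuming the tf_keras package is installed (pip install tf_keras) and that legacy Keras 2 actually resolves this particular error:

import os

# Must be set before TensorFlow (or anything that imports it, e.g. RLlib) is imported.
os.environ["TF_USE_LEGACY_KERAS"] = "1"

import tensorflow as tf  # tf.keras now resolves to Keras 2 (tf_keras)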

Niqnil commented 1 week ago

I think I ran into the same error.

Versions / Dependencies

Ubuntu 22.04.4
Python 3.10.12
TensorFlow 2.17.0
Ray 2.32.0

import gymnasium as gym
import ray
from ray import tune
from ray.rllib.algorithms.ppo import PPOConfig

class CartPoleEnv(gym.Env):
    def __init__(self, config):
        self.env = gym.make("CartPole-v1")
        self.action_space = self.env.action_space
        self.observation_space = self.env.observation_space

    def reset(self, *, seed=None, options=None):
        # Match the gymnasium reset signature and forward seed/options.
        return self.env.reset(seed=seed, options=options)

    def step(self, action):
        return self.env.step(action)

ray.init()

config = (
    PPOConfig()
    .environment(CartPoleEnv)
    .framework("tf2")
    .training(model={"use_lstm": True})
)

tune.run(
    "PPO",
    config=config.to_dict(),
    stop={"training_iteration": 1},
)

ray.shutdown()
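In my case the traceback below bottoms out in rllib/models/tf/recurrent_net.py calling tf.sequence_mask on a symbolic Keras input (the use_lstm path). A standalone sketch of just that call under Keras 3, assuming tensorflow>=2.16:

import tensorflow as tf
from tensorflow import keras

# Stand-in for RLlib's sequence-length input; under Keras 3 this is a KerasTensor.
seq_in = keras.Input(shape=(), name="seq_in", dtype="int32")

try:
    tf.sequence_mask(seq_in)  # the call from recurrent_net.py, line 195
except ValueError as err:
    print(err)  # same "A KerasTensor cannot be used ..." error as in the log below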

Error

2024-07-18 18:28:56,348 INFO worker.py:1779 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265
/home/user/Code/proj/.venv/lib/python3.10/site-packages/gymnasium/spaces/box.py:130: UserWarning: WARN: Box bound precision lowered by casting to float32
  gym.logger.warn(f"Box bound precision lowered by casting to {self.dtype}")
/home/user/Code/proj/.venv/lib/python3.10/site-packages/gymnasium/utils/passive_env_checker.py:164: UserWarning: WARN: The obs returned by the reset() method was expecting numpy array dtype to be float32, actual type: float64
  logger.warn(
/home/user/Code/proj/.venv/lib/python3.10/site-packages/gymnasium/utils/passive_env_checker.py:188: UserWarning: WARN: The obs returned by the reset() method is not within the observation space.
  logger.warn(f"{pre} is not within the observation space.")
╭────────────────────────────────────────────────────────────╮
│ Configuration for experiment PPO_2024-07-18_18-28-57       │
├────────────────────────────────────────────────────────────┤
│ Search algorithm                     BasicVariantGenerator │
│ Scheduler                            FIFOScheduler         │
│ Number of trials                     1                     │
╰────────────────────────────────────────────────────────────╯

View detailed results here: /home/user/ray_results/PPO_2024-07-18_18-28-57
To visualize your results with TensorBoard, run: tensorboard --logdir /tmp/ray/session_2024-07-18_18-28-54_522281_28467/artifacts/2024-07-18_18-28-57/PPO_2024-07-18_18-28-57/driver_artifacts
2024-07-18 18:28:57,598 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_cpus_per_worker has been deprecated. Use AlgorithmConfig.num_cpus_per_env_runner instead. This will raise an error in the future!
2024-07-18 18:28:57,598 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_gpus_per_worker has been deprecated. Use AlgorithmConfig.num_gpus_per_env_runner instead. This will raise an error in the future!
2024-07-18 18:28:57,599 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_learner_workers has been deprecated. Use AlgorithmConfig.num_learners instead. This will raise an error in the future!
2024-07-18 18:28:57,599 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_cpus_per_learner_worker has been deprecated. Use AlgorithmConfig.num_cpus_per_learner instead. This will raise an error in the future!
2024-07-18 18:28:57,599 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_gpus_per_learner_worker has been deprecated. Use AlgorithmConfig.num_gpus_per_learner instead. This will raise an error in the future!

Trial status: 1 PENDING
Current time: 2024-07-18 18:28:57. Total running time: 0s
Logical resource usage: 0/16 CPUs, 0/0 GPUs
╭────────────────────────────────────────╮
│ Trial name                     status  │
├────────────────────────────────────────┤
│ PPO_CartPoleEnv_8a5aa_00000    PENDING │
╰────────────────────────────────────────╯
(PPO pid=29645) 2024-07-18 18:29:00,592 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_cpus_per_worker has been deprecated. Use AlgorithmConfig.num_cpus_per_env_runner instead. This will raise an error in the future!
(PPO pid=29645) 2024-07-18 18:29:00,592 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_gpus_per_worker has been deprecated. Use AlgorithmConfig.num_gpus_per_env_runner instead. This will raise an error in the future!
(PPO pid=29645) 2024-07-18 18:29:00,592 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_learner_workers has been deprecated. Use AlgorithmConfig.num_learners instead. This will raise an error in the future!
(PPO pid=29645) 2024-07-18 18:29:00,592 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_cpus_per_learner_worker has been deprecated. Use AlgorithmConfig.num_cpus_per_learner instead. This will raise an error in the future!
(PPO pid=29645) 2024-07-18 18:29:00,593 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_gpus_per_learner_worker has been deprecated. Use AlgorithmConfig.num_gpus_per_learner instead. This will raise an error in the future!
2024-07-18 18:29:04,542 ERROR tune_controller.py:1331 -- Trial task failed for trial PPO_CartPoleEnv_8a5aa_00000
Traceback (most recent call last):
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/air/execution/_internal/event_manager.py", line 110, in resolve_future
    result = ray.get(future)
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/_private/auto_init_hook.py", line 21, in auto_init_wrapper
    return fn(*args, **kwargs)
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/_private/worker.py", line 2656, in get
    values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/_private/worker.py", line 873, in get_objects
    raise value
ray.exceptions.ActorDiedError: The actor died because of an error raised in its creation task, ray::PPO.__init__() (pid=29645, ip=192.168.1.58, actor_id=deea0a06320955d8b84b3e7e01000000, repr=PPO)
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/env/env_runner_group.py", line 241, in _setup
    self.add_workers(
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/env/env_runner_group.py", line 801, in add_workers
    raise result.get()
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/utils/actor_manager.py", line 500, in _fetch_result
    result = ray.get(ready)
ray.exceptions.ActorDiedError: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=29732, ip=192.168.1.58, actor_id=fa44599552da1c71bf3027bd01000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7e80b3636d10>)
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 521, in __init__
    self._update_policy_map(policy_dict=self.policy_dict)
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1728, in _update_policy_map
    self._build_policy_map(
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1839, in _build_policy_map
    new_policy = create_policy_for_framework(
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/utils/policy.py", line 138, in create_policy_for_framework
    return policy_class(observation_space, action_space, merged_config)
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/policy/eager_tf_policy.py", line 167, in __init__
    super(TracedEagerPolicy, self).__init__(*args, **kwargs)
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/algorithms/ppo/ppo_tf_policy.py", line 81, in __init__
    base.__init__(
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/policy/eager_tf_policy_v2.py", line 120, in __init__
    self.model = self.make_model()
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/policy/eager_tf_policy_v2.py", line 271, in make_model
    return ModelCatalog.get_model_v2(
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/models/catalog.py", line 799, in get_model_v2
    return wrapper(
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/models/tf/recurrent_net.py", line 195, in __init__
    mask=tf.sequence_mask(seq_in),
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/keras/src/backend/common/keras_tensor.py", line 91, in __tf_tensor__
    raise ValueError(
ValueError: A KerasTensor cannot be used as input to a TensorFlow function. A KerasTensor is a symbolic placeholder for a shape and dtype, used when constructing Keras Functional models or Keras Functions. You can only use it as input to a Keras layer or a Keras operation (from the namespaces keras.layers and keras.operations). You are likely doing something like:

x = Input(...)
...
tf_fn(x)  # Invalid.

What you should do instead is wrap tf_fn in a layer:

class MyLayer(Layer):
    def call(self, x):
        return tf_fn(x)

x = MyLayer()(x)

During handling of the above exception, another exception occurred:

ray::PPO.__init__() (pid=29645, ip=192.168.1.58, actor_id=deea0a06320955d8b84b3e7e01000000, repr=PPO)
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 532, in __init__
    super().__init__(
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/tune/trainable/trainable.py", line 158, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 618, in setup
    self.workers = EnvRunnerGroup(
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/env/env_runner_group.py", line 193, in __init__
    raise e.args[0].args[2]
ValueError: A KerasTensor cannot be used as input to a TensorFlow function. A KerasTensor is a symbolic placeholder for a shape and dtype, used when constructing Keras Functional models or Keras Functions. You can only use it as input to a Keras layer or a Keras operation (from the namespaces keras.layers and keras.operations). You are likely doing something like:

x = Input(...)
...
tf_fn(x)  # Invalid.

What you should do instead is wrap tf_fn in a layer:

class MyLayer(Layer):
    def call(self, x):
        return tf_fn(x)

x = MyLayer()(x)

Trial PPO_CartPoleEnv_8a5aa_00000 errored after 0 iterations at 2024-07-18 18:29:04. Total running time: 6s
Error file: /tmp/ray/session_2024-07-18_18-28-54_522281_28467/artifacts/2024-07-18_18-28-57/PPO_2024-07-18_18-28-57/driver_artifacts/PPO_CartPoleEnv_8a5aa_00000_0_2024-07-18_18-28-57/error.txt
2024-07-18 18:29:04,547 INFO tune.py:1009 -- Wrote the latest version of all result files and experiment state to '/home/user/ray_results/PPO_2024-07-18_18-28-57' in 0.0037s.

Trial status: 1 ERROR
Current time: 2024-07-18 18:29:04. Total running time: 6s
Logical resource usage: 0/16 CPUs, 0/0 GPUs
╭────────────────────────────────────────╮
│ Trial name                     status  │
├────────────────────────────────────────┤
│ PPO_CartPoleEnv_8a5aa_00000    ERROR   │
╰────────────────────────────────────────╯

Number of errored trials: 1
╭───────────────────────────────────────────────────────────────────────────╮
│ Trial name                    # failures   error file                     │
├───────────────────────────────────────────────────────────────────────────┤
│ PPO_CartPoleEnv_8a5aa_00000   1            /tmp/ray/session_2024-07-18_18-28-54_522281_28467/artifacts/2024-07-18_18-28-57/PPO_2024-07-18_18-28-57/driver_artifacts/PPO_CartPoleEnv_8a5aa_00000_0_2024-07-18_18-28-57/error.txt │
╰───────────────────────────────────────────────────────────────────────────╯

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/user/Code/proj/proj/rllib_ex.py", line 29, in <module>
    tune.run(
  File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/tune/tune.py", line 1035, in run
    raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [PPO_CartPoleEnv_8a5aa_00000])

(PPO pid=29645) 2024-07-18 18:29:04,525 ERROR actor_manager.py:523 -- Ray error, taking actor 1 out of service. The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=29732, ip=192.168.1.58, actor_id=fa44599552da1c71bf3027bd01000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7e80b3636d10>)
(PPO pid=29645) [same RolloutWorker traceback and KerasTensor ValueError as above]
(PPO pid=29645) 2024-07-18 18:29:04,526 ERROR actor_manager.py:523 -- Ray error, taking actor 2 out of service. The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=29733, ip=192.168.1.58, actor_id=5bfd0c39db45fb7efe6bcfb501000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7aabd5b52c80>)
(PPO pid=29645) [same RolloutWorker traceback and KerasTensor ValueError as above]
(PPO pid=29645) Exception raised in creation task: The actor died because of an error raised in its creation task, ray::PPO.__init__() (pid=29645, ip=192.168.1.58, actor_id=deea0a06320955d8b84b3e7e01000000, repr=PPO)
(PPO pid=29645) [same nested PPO -> RolloutWorker tracebacks, KerasTensor ValueError, and "During handling of the above exception, another exception occurred:" section as above]
(RolloutWorker pid=29732) Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=29732, ip=192.168.1.58, actor_id=fa44599552da1c71bf3027bd01000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7e80b3636d10>)
(RolloutWorker pid=29733) [same RolloutWorker traceback and KerasTensor ValueError, reported as "repeated 2x across cluster"]
(Ray deduplicates logs by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/master/ray-observability/user-guides/configure-logging.html#log-deduplication for more options.)
(RolloutWorker pid=29733) Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=29733, ip=192.168.1.58, actor_id=5bfd0c39db45fb7efe6bcfb501000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7aabd5b52c80>)