Closed: fardinabbasi closed this issue 1 year ago.
Hmm, something is wrong with the env registration lambda. I think somewhere you provide an env creator function that has no input arguments.
Maybe here?
train_env = lambda: RankingEnv,
However, RLlib always passes in the config.env_config dict when it calls the registered env creator, which is why you are getting this error.
Changing your code to the following should help:
train_env = lambda env_config: RankingEnv,
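For clarity, a minimal sketch of a creator that both accepts the config and returns an env instance (assuming RankingEnv takes its config dict in the constructor; adjust to your env's actual signature):

from ray.tune.registry import register_env

# RLlib calls the registered creator with an EnvContext built from
# config.env_config, so the creator must accept one argument and should
# return an environment *instance*:
def env_creator(env_config):
    return RankingEnv(env_config)

register_env("RankingEnv", env_creator)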
I'm closing this issue for now. Feel free to re-open should you still have problems with your example after fixing your custom env creator function.
Thank you for your prompt reply. I have made the changes as per your suggestions:
drl_agent = DRLlibv2(
    trainable="PPO",
    train_env=lambda env_config: RankingEnv,
    run_name="PPO_TRAIN",
    local_dir="/content/PPO_TRAIN",
    params=train_config.to_dict(),
    num_samples=1,  # number of hyperparameter-config samples to run
    training_iterations=5,
    checkpoint_freq=5,
    # scheduler_=scheduler_,
    search_alg=search_alg,
    metric="episode_reward_mean",
    mode="max",
    # callbacks=[wandb_callback]
)
However, I am still encountering the same warnings and experiencing failures. My primary goal is to pass my custom environment class named RankingEnv to the DRLlibv2 class so that I can run it with Ray Tune. In the train_tune_model function of DRLlibv2, I register my environment using register_env and then pass its name to tune.Tuner:
register_env(self.params['env'], lambda env_config: self.train_env(env_config))
I also attempted to register it like this, but it did not resolve the issue:
register_env(self.params['env'], self.train_env)
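For reference, here is a self-contained sketch of the pattern I am trying to get working (this RankingEnv is a stand-in with the same interface as my real env):

import gymnasium as gym
import numpy as np
from ray import tune
from ray.tune.registry import register_env

class RankingEnv(gym.Env):  # stand-in for my real environment
    def __init__(self, env_config):
        self.action_space = gym.spaces.Discrete(2)
        self.observation_space = gym.spaces.Box(-1.0, 1.0, shape=(4,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        return self.observation_space.sample(), {}

    def step(self, action):
        # single-step episodes, just enough for a runnable skeleton
        return self.observation_space.sample(), 0.0, True, False, {}

# Register under the name stored in self.params['env'], then hand that
# name (not the class) to the Tuner via the param space:
register_env("RankingEnv", lambda env_config: RankingEnv(env_config))
tuner = tune.Tuner("PPO", param_space={"env": "RankingEnv"})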
I would appreciate any further guidance or insights you can provide to help me resolve this issue. Thank you.
I have also encountered the issue from your error report. Have you resolved it?
WARNING algorithm_config.py:2578 -- Setting exploration_config={} because you set _enable_rl_module_api=True. When RLModule API are enabled, exploration_config can not be set. If you want to implement custom exploration behaviour, please modify the forward_exploration method of the RLModule at hand. On configs that have a default exploration config, this must be done with config.exploration_config={}.
Unfortunately I still have this issue, please let me know if you find a solution.
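For anyone else hitting that warning, its own message suggests clearing the exploration config explicitly. A minimal sketch, assuming Ray 2.7's PPOConfig builder API (I have not confirmed this also fixes the actor crash):

from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment(env="RankingEnv")
    # The new RLModule/Learner stack does not use exploration_config,
    # so clear it explicitly to silence the warning:
    .exploration(exploration_config={})
    .rl_module(_enable_rl_module_api=True)
    .training(_enable_learner_api=True)
)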
Okay, thank you! I am wondering if it is due to a problem with the Ray version; I am using 2.7.0. Would downgrading Ray to 2.5.x change the current issue?
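If it helps anyone test the downgrade, pinning the version should be enough (assuming the rllib extra is installed): pip install "ray[rllib]==2.5.1"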
It seems related to #40205
What happened + What you expected to happen
I am trying to train a PPO agent using ray.tune, but I'm getting many warnings that lead to the agent dying.
Versions / Dependencies
Reproduction script
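A minimal sketch of the failing pattern, reconstructed from the traceback below (RankingEnv stands in for the custom environment):

from ray.tune.registry import register_env

class RankingEnv:  # stand-in for the custom environment
    pass

# The env was registered with a zero-argument creator:
env_creator = lambda: RankingEnv
register_env("RankingEnv", env_creator)

# RolloutWorker later invokes the creator with the EnvContext
# (rollout_worker.py:397), which reproduces the crash:
env_creator({})  # TypeError: <lambda>() takes 0 positional arguments but 1 was given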
(pid=5706) /usr/local/lib/python3.10/dist-packages/tensorflow_probability/python/__init__.py:57: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
(pid=5706)   if (distutils.version.LooseVersion(tf.__version__) <
(PPO pid=5706)   File "", line 186, in <lambda>
(PPO pid=5706) TypeError: <lambda>() takes 0 positional arguments but 1 was given
(PPO pid=5706) Exception raised in creation task: The actor died because of an error raised in its creation task, ray::PPO.__init__() (pid=5706, ip=172.28.0.12, actor_id=67b96a420d1057972a8a600601000000, repr=PPO)
(PPO pid=5706)   File "/usr/local/lib/python3.10/dist-packages/ray/rllib/algorithms/algorithm.py", line 517, in __init__
(PPO pid=5706)     super().__init__(
(PPO pid=5706)   File "/usr/local/lib/python3.10/dist-packages/ray/tune/trainable/trainable.py", line 185, in __init__
(PPO pid=5706)     self.setup(copy.deepcopy(self.config))
(PPO pid=5706)   File "/usr/local/lib/python3.10/dist-packages/ray/rllib/algorithms/algorithm.py", line 639, in setup
(PPO pid=5706)     self.workers = WorkerSet(
(PPO pid=5706)   File "/usr/local/lib/python3.10/dist-packages/ray/rllib/evaluation/worker_set.py", line 179, in __init__
(PPO pid=5706)     raise e.args[0].args[2]
(PPO pid=5706)   File "/usr/local/lib/python3.10/dist-packages/ray/rllib/evaluation/rollout_worker.py", line 397, in __init__
(PPO pid=5706)     self.env = env_creator(copy.deepcopy(self.env_context))
(PPO pid=5706)   File "", line 186, in <lambda>
(PPO pid=5706) TypeError: <lambda>() takes 0 positional arguments but 1 was given
(RolloutWorker pid=5784) Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=5784, ip=172.28.0.12, actor_id=87f2e00c598d2835416acdce01000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7c93458ee920>)
2023-09-19 15:57:41,292 ERROR tune_controller.py:1502 -- Trial task failed for trial PPO_RankingEnv_train_c5d87e81
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ray/air/execution/_internal/event_manager.py", line 110, in resolve_future
    result = ray.get(future)
  File "/usr/local/lib/python3.10/dist-packages/ray/_private/auto_init_hook.py", line 24, in auto_init_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/ray/_private/worker.py", line 2549, in get
    raise value
  File "python/ray/_raylet.pyx", line 1999, in ray._raylet.task_execution_handler
  File "python/ray/_raylet.pyx", line 1894, in ray._raylet.execute_task_with_cancellation_handler
  File "python/ray/_raylet.pyx", line 1558, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 1559, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 1791, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 910, in ray._raylet.store_task_errors
ray.exceptions.RayActorError: The actor died because of an error raised in its creation task, ray::PPO.__init__() (pid=5706, ip=172.28.0.12, actor_id=67b96a420d1057972a8a600601000000, repr=PPO)
  File "/usr/local/lib/python3.10/dist-packages/ray/rllib/algorithms/algorithm.py", line 517, in __init__
    super().__init__(
  File "/usr/local/lib/python3.10/dist-packages/ray/tune/trainable/trainable.py", line 185, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/usr/local/lib/python3.10/dist-packages/ray/rllib/algorithms/algorithm.py", line 639, in setup
    self.workers = WorkerSet(
  File "/usr/local/lib/python3.10/dist-packages/ray/rllib/evaluation/worker_set.py", line 179, in __init__
    raise e.args[0].args[2]
  File "/usr/local/lib/python3.10/dist-packages/ray/rllib/evaluation/rollout_worker.py", line 397, in __init__
    self.env = env_creator(copy.deepcopy(self.env_context))
  File "", line 186, in <lambda>
TypeError: <lambda>() takes 0 positional arguments but 1 was given
2023-09-19 15:57:41,357 WARNING experiment_state.py:371 -- Experiment checkpoint syncing has been triggered multiple times in the last 30.0 seconds. A sync will be triggered whenever a trial has checkpointed more than num_to_keep times since last sync or if 300 seconds have passed since last sync. If you have set num_to_keep in your CheckpointConfig, consider increasing the checkpoint frequency or keeping more checkpoints. You can suppress this warning by changing the TUNE_WARN_EXCESSIVE_EXPERIMENT_CHECKPOINT_SYNC_THRESHOLD_S environment variable.
(pid=5706) DeprecationWarning: DirectStepOptimizer has been deprecated. This will raise an error in the future!
(pid=5706) /usr/local/lib/python3.10/dist-packages/google/rpc/__init__.py:20: DeprecationWarning: Deprecated call to pkg_resources.declare_namespace('google.rpc'). Implementing implicit namespace packages (as specified in PEP 420) is preferred to pkg_resources.declare_namespace. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
(pid=5706)   pkg_resources.declare_namespace(name)
(pid=5706) /usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py:2349: DeprecationWarning: Deprecated call to pkg_resources.declare_namespace('google'). Implementing implicit namespace packages (as specified in PEP 420) is preferred to pkg_resources.declare_namespace. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
(pid=5706)   declare_namespace(parent)
(PPO pid=5706) 2023-09-19 15:57:32,398 WARNING algorithm_config.py:2578 -- Setting exploration_config={} because you set _enable_rl_module_api=True. When RLModule API are enabled, exploration_config can not be set. If you want to implement custom exploration behaviour, please modify the forward_exploration method of the RLModule at hand. On configs that have a default exploration config, this must be done with config.exploration_config={}.
(PPO pid=5706) 2023-09-19 15:57:32,399 WARNING algorithm_config.py:672 -- Cannot create PPOConfig from given config_dict! Property __stdout_file__ not supported.
(pid=5784) /usr/local/lib/python3.10/dist-packages/tensorflow_probability/python/__init__.py:57: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
(pid=5784)   if (distutils.version.LooseVersion(tf.__version__) <
(pid=5784) DeprecationWarning: DirectStepOptimizer has been deprecated. This will raise an error in the future!
(pid=5784) /usr/local/lib/python3.10/dist-packages/google/rpc/__init__.py:20: DeprecationWarning: Deprecated call to pkg_resources.declare_namespace('google.rpc'). Implementing implicit namespace packages (as specified in PEP 420) is preferred to pkg_resources.declare_namespace. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
(pid=5784)   pkg_resources.declare_namespace(name)
(pid=5784) /usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py:2349: DeprecationWarning: Deprecated call to pkg_resources.declare_namespace('google'). Implementing implicit namespace packages (as specified in PEP 420) is preferred to pkg_resources.declare_namespace. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
(pid=5784)   declare_namespace(parent)
(PPO pid=5706) 2023-09-19 15:57:41,273 ERROR actor_manager.py:500 -- Ray error, taking actor 1 out of service. The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=5784, ip=172.28.0.12, actor_id=87f2e00c598d2835416acdce01000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7c93458ee920>)
(PPO pid=5706)   File "/usr/local/lib/python3.10/dist-packages/ray/rllib/evaluation/rollout_worker.py", line 397, in __init__
(PPO pid=5706)     self.env = env_creator(copy.deepcopy(self.env_context))
(PPO pid=5706)   File "
2023-09-19 15:57:41,368 ERROR tune.py:1139 -- Trials did not complete: [PPO_RankingEnv_train_c5d87e81]
2023-09-19 15:57:41,375 WARNING experiment_analysis.py:205 -- Failed to fetch metrics for 1 trial(s):
Trial PPO_RankingEnv_train_c5d87e81 errored after 0 iterations at 2023-09-19 15:57:41. Total running time: 20s
Error file: /root/ray_results/PPO_TRAIN/PPO_RankingEnv_train_c5d87e81_1_type=StochasticSampling,disable_action_flattening=False,disable_execution_plan_api=True,disable_in_2023-09-19_15-57-21/error.txt
Issue Severity
High: It blocks me from completing my task.