Open mantasu opened 3 weeks ago
This only happens when `.env_runners(num_env_runners=0)` is not set, unlike in the tutorial.
@mantasu Thanks for raising this issue. It appears that your Python module `my_module` is not installed in the Python environment. As soon as you start remote workers, they cannot find this module, probably because the remote workers use different file search paths. When starting a Ray cluster, you either have to install the module directly into the Python environment or use an absolute file path.
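To illustrate the search-path point without Ray, here is a minimal sketch using only the standard library. It assumes a fresh interpreter started in another directory is a fair stand-in for a remote worker. The module name `my_module_pathdemo` and the helper `try_import` are hypothetical and exist only for this demonstration.

```python
import os
import subprocess
import sys
import tempfile


def try_import(module_name, cwd, extra_path=None):
    """Return True if a fresh interpreter started in `cwd` can import `module_name`."""
    env = dict(os.environ)
    # Start from a clean search path so only `cwd` and `extra_path` matter.
    env.pop("PYTHONPATH", None)
    env.pop("PYTHONSAFEPATH", None)
    if extra_path is not None:
        env["PYTHONPATH"] = extra_path
    proc = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        cwd=cwd,
        env=env,
        capture_output=True,
        text=True,
    )
    return proc.returncode == 0


with tempfile.TemporaryDirectory() as project_dir, \
        tempfile.TemporaryDirectory() as worker_dir:
    # Hypothetical local module, standing in for `my_module`.
    with open(os.path.join(project_dir, "my_module_pathdemo.py"), "w") as f:
        f.write("VALUE = 1\n")

    # Process started next to the module: the import resolves via the cwd.
    same_dir = try_import("my_module_pathdemo", cwd=project_dir)
    # Process started elsewhere (like a remote worker): the module is not found.
    other_dir = try_import("my_module_pathdemo", cwd=worker_dir)
    # Same process, but with the absolute project path on the search path.
    with_abs = try_import(
        "my_module_pathdemo",
        cwd=worker_dir,
        extra_path=os.path.abspath(project_dir),
    )

print(same_dir, other_dir, with_abs)
```

The import only succeeds when the module's directory is on the interpreter's search path, which is why installing the package into the environment or pointing to it by absolute path resolves the error. With Ray specifically, the `runtime_env` argument of `ray.init` (for example its `working_dir` or `py_modules` fields) may be another way to ship local code to workers, though I have not verified it against this exact setup.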
What happened + What you expected to happen
Following the official tutorial, it is possible to specify custom `RLModule`s. However, after importing them from a custom local package, I get errors about that custom package not being found. The custom `RLModule` can only be found if it is defined within the same file or a relative import is used.
Versions / Dependencies
`rllib==2.24.0`, `python=3.11`
Reproduction script
File structure:
`my_ppo_torch_rl_module.py` content:
`main.ipynb` content (single cell):
Error traceback
```bash
---------------------------------------------------------------------------
ActorDiedError                            Traceback (most recent call last)
File ~/programs/anaconda/envs/sddl/lib/python3.11/site-packages/ray/rllib/env/env_runner_group.py:169, in EnvRunnerGroup.__init__(self, env_creator, validate_env, default_policy_class, config, num_env_runners, local_env_runner, logdir, _setup, num_workers, local_worker)
    168 try:
--> 169     self._setup(
    170         validate_env=validate_env,
    171         config=config,
    172         num_env_runners=num_env_runners,
    173         local_env_runner=local_env_runner,
    174     )
    175 # EnvRunnerGroup creation possibly fails, if some (remote) workers cannot
    176 # be initialized properly (due to some errors in the EnvRunners's
    177 # constructor).

File ~/programs/anaconda/envs/sddl/lib/python3.11/site-packages/ray/rllib/env/env_runner_group.py:239, in EnvRunnerGroup._setup(self, validate_env, config, num_env_runners, local_env_runner)
    238 # Create a number of @ray.remote workers.
--> 239 self.add_workers(
    240     num_env_runners,
    241     validate=config.validate_env_runners_after_construction,
    242 )
    244 # If num_env_runners > 0 and we don't have an env on the local worker,
    245 # get the observation- and action spaces for each policy from
    246 # the first remote worker (which does have an env).

File ~/programs/anaconda/envs/sddl/lib/python3.11/site-packages/ray/rllib/env/env_runner_group.py:754, in EnvRunnerGroup.add_workers(self, num_workers, validate)
    753 if not result.ok:
--> 754     raise result.get()

File ~/programs/anaconda/envs/sddl/lib/python3.11/site-packages/ray/rllib/utils/actor_manager.py:500, in FaultTolerantActorManager._fetch_result(self, remote_actor_ids, remote_calls, tags, timeout_seconds, return_obj_refs, mark_healthy)
    499 try:
--> 500     result = ray.get(ready)
    501     remote_results.add_result(actor_id, ResultOrError(result=result), tag)

File ~/programs/anaconda/envs/sddl/lib/python3.11/site-packages/ray/_private/auto_init_hook.py:21, in wrap_auto_init.
```
Issue Severity
Medium: It is a significant difficulty but I can work around it.