ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[RLlib] Can't use environment "ALE/Tetris-v5" with rllib #26432

Closed: darkcoder2000 closed this issue 2 years ago

darkcoder2000 commented 2 years ago

What happened + What you expected to happen

Hello, I am having trouble using the 'ALE/Tetris-v5' env with rllib.

When I start training, I get the following error:

(RolloutWorker pid=8800)   File "C:\Users\SESA59632\Anaconda3\envs\RL_GymStableRay\lib\site-packages\ray\rllib\env\wrappers\atari_wrappers.py", line 100, in reset
(RolloutWorker pid=8800)     noops = self.unwrapped.np_random.integers(1, self.noop_max + 1)
(RolloutWorker pid=8800) AttributeError: 'numpy.random.mtrand.RandomState' object has no attribute 'integers'

However, I can use the same Tetris environment with Stable Baselines, which I run in the same Python virtual environment. So the Tetris environment works in general.

Versions / Dependencies

python 3.7.13
ale-py 0.7.4
gym 0.21.0
ray 1.12.1
torch 1.10.2+cu113
numpy 1.21.6

Reproduction script

from ray import tune
from ray.rllib.agents.a3c import A3CTrainer

config = {
    # Environment (RLlib understands OpenAI Gym registered strings).
    "env": "ALE/Tetris-v5",
    "num_gpus": 1,
    "num_workers": 1,
    "framework": "torch",
    "log_level": "INFO",
    "evaluation_num_workers": 1,
    # Only for evaluation runs, render the env.
    "evaluation_config": {
        "render_env": True,
    },
}

# How many time steps to run the experiment for.
time_steps_total = 5_000

# Run the experiment.
# Note: local_dir (the results directory) is not defined in this snippet
# and must be set before the call.
results = tune.run(
    # agents.ppo.PPOTrainer,
    A3CTrainer,
    config=config,
    metric="episode_reward_mean",
    mode="max",
    stop={"timesteps_total": time_steps_total},
    checkpoint_at_end=True,
    checkpoint_freq=10,
    local_dir=local_dir,
)

Issue Severity

Low: It annoys or frustrates me.

darkcoder2000 commented 2 years ago

After reinstalling everything from scratch with Python 3.8 in a new virtual environment, the issue is gone.
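
For anyone who hits the same traceback: the error message shows that the environment's `np_random` was a legacy `numpy.random.RandomState`, which provides `randint` but not `integers`. The `integers` method belongs to the newer `numpy.random.Generator`. A likely cause, not confirmed in this thread, is a mismatch between the gym version and what the RLlib Atari wrapper expects. The sketch below only illustrates the difference between the two NumPy APIs. The helper `sample_noops` is a hypothetical name for illustration and is not part of RLlib or gym.

```python
import numpy as np


def sample_noops(rng, noop_max):
    """Draw an integer in [1, noop_max] from either NumPy RNG flavor.

    Hypothetical helper for illustration; not part of RLlib or gym.
    """
    if hasattr(rng, "integers"):
        # numpy.random.Generator: the upper bound is exclusive by default.
        return int(rng.integers(1, noop_max + 1))
    # Legacy numpy.random.RandomState: randint also excludes the upper bound.
    return int(rng.randint(1, noop_max + 1))


legacy = np.random.RandomState(0)   # old-style RNG, as in the traceback
modern = np.random.default_rng(0)   # new-style Generator

print(hasattr(legacy, "integers"))  # False
print(hasattr(modern, "integers"))  # True
print(sample_noops(legacy, 30), sample_noops(modern, 30))
```

Checking `type(env.unwrapped.np_random)` in the failing environment shows which of the two objects is in use, which can help decide whether a clean reinstall with matching package versions is needed.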