ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0
33.2k stars, 5.61k forks

[RLlib][Windows] error in custom_env.py #30525

Closed. XA23i closed this issue 1 year ago.

XA23i commented 1 year ago

What happened + What you expected to happen

When I installed Ray RLlib and ran my first RLlib code (the example "custom_env.py"), I came across an error:

    (rllib2) C:\Users\Raymond\Desktop\rayRL>python custom_env.py --framework torch
    Running with following CLI options: Namespace(as_test=False, framework='torch', local_mode=False, no_tune=False, run='PPO', stop_iters=50, stop_reward=0.1, stop_timesteps=100000)
    2022-11-21 20:41:05,424 INFO worker.py:1528 -- Started a local Ray instance.
    Traceback (most recent call last):
      File "custom_env.py", line 161, in
        get_trainable_cls(args.run)
    AttributeError: 'dict' object has no attribute 'environment'

I am very confused.

Versions / Dependencies

Installed as per the instructions.

Reproduction script

python custom_env.py

Issue Severity

No response

XA23i commented 1 year ago

It seems to work when using ray-3.0. However, no instructions told me that.
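Since the behaviour reported here differs between Ray releases, it can help to state the installed version explicitly when reporting. The following is a minimal sketch using only the Python standard library; the helper names `installed_version` and `major_minor` are hypothetical and are not part of Ray.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def major_minor(version_string):
    """Split a version such as '2.1.0' into an integer (major, minor) tuple.

    Assumes at least two dot-separated numeric components.
    """
    parts = version_string.split(".")
    return int(parts[0]), int(parts[1])


ray_version = installed_version("ray")
if ray_version is None:
    print("ray is not installed in this environment")
else:
    print("ray", ray_version, "->", major_minor(ray_version))
```

Comparing the reported version with the branch the example script was copied from is a quick way to spot a mismatch between an installed release and a script taken from master.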

XA23i commented 1 year ago

By the way, I am on Windows.

samo133 commented 1 year ago

The same error occurs when running the custom model / custom environment example.

Operating system: Ubuntu 20.04

Ray version: 2.1.0

samo133 commented 1 year ago

The solution that works for me:

    import os

    from ray.rllib.algorithms.ppo import PPOConfig

    config = (
        PPOConfig()
        .environment(SimpleCorridor, env_config={"corridor_length": 5})
        .framework(args.framework)
        .rollouts(num_rollout_workers=1)
        .training(
            model={
                "custom_model": "my_model",
                "vf_share_layers": True,
            }
        )
        # Use GPUs iff RLLIB_NUM_GPUS env var set to > 0.
        .resources(num_gpus=int(os.environ.get("RLLIB_NUM_GPUS", "0")))
    )

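
One plausible reading of the original error, not confirmed in this thread, is that the example script chains builder-style methods on a config object while the installed release hands back a plain dict. The sketch below illustrates only that general difference. `SketchConfig` and `try_chain` are hypothetical stand-ins written for this illustration and are not RLlib classes or functions.

```python
class SketchConfig:
    """Hypothetical builder-style config; a stand-in, not RLlib's PPOConfig."""

    def __init__(self):
        self.settings = {}

    def environment(self, env, env_config=None):
        self.settings["env"] = env
        self.settings["env_config"] = env_config or {}
        return self  # returning self is what allows method chaining

    def framework(self, name):
        self.settings["framework"] = name
        return self


def try_chain(config):
    """Chain two builder calls; return the settings, or the AttributeError text."""
    try:
        chained = config.environment(
            "SimpleCorridor", env_config={"corridor_length": 5}
        ).framework("torch")
        return chained.settings
    except AttributeError as exc:
        return str(exc)


dict_result = try_chain({})  # a plain dict has no .environment method
builder_result = try_chain(SketchConfig())
print(dict_result)
print(builder_result)
```

Under this reading, constructing the config object directly, as in the snippet above, sidesteps whatever a version-dependent lookup returns.
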
sven1977 commented 1 year ago

I can confirm it's working on master (3.0) on Windows as well as 2.1.

This script is part of our CI testing suite, so it's constantly checked.

+--------------------------------+------------+-----------------+--------+------------------+-------+----------+----------------------+----------------------+--------------------+
| Trial name                     | status     | loc             |   iter |   total time (s) |    ts |   reward |   episode_reward_max |   episode_reward_min |   episode_len_mean |
|--------------------------------+------------+-----------------+--------+------------------+-------+----------+----------------------+----------------------+--------------------|
| PPO_SimpleCorridor_fb9cf_00000 | TERMINATED | 127.0.0.1:36436 |      3 |          34.8335 | 12000 | 0.157853 |              1.59751 |             -2.09562 |            9.27972 |
+--------------------------------+------------+-----------------+--------+------------------+-------+----------+----------------------+----------------------+--------------------+