ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[Bug] Unity3D env lacks group rewards #21489

Closed grypesc closed 2 years ago

grypesc commented 2 years ago

Search before asking

Ray Component

RLlib

What happened + What you expected to happen

Hi, there is a bug in RLlib 1.9.1 when used with Release 18 of ML-Agents. In Unity multi-policy environments such as SoccerStrikersVsGoalie, the reward data that reaches the RLlib side includes an additional group reward, but RLlib does not use it at all. As a result, multi-policy environments that depend on group rewards cannot be solved with the framework.

Reproduce: Run the SoccerStrikersVsGoalie environment using the ray/rllib/examples/unity3d_env_local.py script. The agents do not learn a proper policy at all; the problem remains after 200,000 timesteps and after tuning hyperparameters. The mean reward in progress.csv doesn't improve.
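One way to check the "mean reward doesn't improve" symptom is to read the reward column from progress.csv and compare the first and last values. The sketch below is only an illustration: the column name episode_reward_mean is assumed to be what this RLlib version writes, the 0.05 threshold is arbitrary, and the sample rows are made up for the example rather than taken from a real run.

```python
import csv
import io
from typing import List, TextIO


def read_mean_rewards(f: TextIO, column: str = "episode_reward_mean") -> List[float]:
    """Return the values of `column` from a progress.csv-style file.

    The column name is an assumption; adjust it to match your file.
    """
    reader = csv.DictReader(f)
    # Skip rows where the column is missing or empty.
    return [float(row[column]) for row in reader if row.get(column) not in (None, "")]


# Made-up sample data standing in for a real progress.csv.
sample = io.StringIO(
    "episode_reward_mean,timesteps_total\n"
    "-0.1,4000\n"
    "-0.12,8000\n"
    "-0.11,12000\n"
)

rewards = read_mean_rewards(sample)
# Arbitrary threshold: treat a gain of more than 0.05 as "improved".
improved = (rewards[-1] - rewards[0]) > 0.05
print("improved:", improved)
```

For a real run, pass open("progress.csv", newline="") from the trial's results directory instead of the in-memory sample.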

Solution: The method _get_step_results() of class Unity3DEnv(MultiAgentEnv) in ray/rllib/env/wrappers/unity3d_env.py should include the group reward. I can submit a PR; policies for SoccerStrikersVsGoalie trained after adding group rewards work as intended.
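A minimal sketch of the idea, not the actual patch: add each agent's group reward to its individual reward when building the per-agent reward dict. StepsStub is a hypothetical stand-in for the ML-Agents steps batch (assumed here to expose agent_id, reward and group_reward as parallel sequences), and combine_rewards and the "behavior_agentid" key format are illustrative names, not RLlib API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StepsStub:
    """Hypothetical stand-in for an ML-Agents steps batch (not the real class)."""
    agent_id: List[int]
    reward: List[float]
    group_reward: List[float]


def combine_rewards(behavior_name: str, steps: StepsStub) -> Dict[str, float]:
    """Build a per-agent reward dict that includes the group reward."""
    rewards: Dict[str, float] = {}
    for idx, agent_id in enumerate(steps.agent_id):
        # Illustrative key format; the real wrapper may name agents differently.
        key = f"{behavior_name}_{agent_id}"
        # Individual reward plus the reward shared by the agent's group.
        rewards[key] = steps.reward[idx] + steps.group_reward[idx]
    return rewards


# Two strikers that each receive a shared group reward of 1.0.
steps = StepsStub(agent_id=[0, 1], reward=[0.0, -0.5], group_reward=[1.0, 1.0])
combined = combine_rewards("Striker", steps)
print(combined)
```

Summing the two signals keeps the per-agent reward interface unchanged, so the policies see the team outcome without any change to the training configuration.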

Versions / Dependencies

Windows 10
Python 3.9.9
Unity editor v.2019.4.25f1
ML-Agents Release 18

Python packages:

Package Version
absl-py 1.0.0
astunparse 1.6.3
attrs 21.4.0
autonomous-learning-library 0.7.2
cachetools 4.2.4
cattrs 1.5.0
certifi 2021.10.8
charset-normalizer 2.0.10
click 8.0.3
cloudpickle 1.6.0
colorama 0.4.4
cycler 0.11.0
Deprecated 1.2.13
dm-tree 0.1.6
filelock 3.4.2
flatbuffers 2.0
fonttools 4.28.5
gast 0.4.0
google-auth 2.3.3
google-auth-oauthlib 0.4.6
google-pasta 0.2.0
grpcio 1.43.0
gym 0.18.3
h5py 3.6.0
hyperlink 21.0.0
idna 3.3
importlib-metadata 4.10.0
Jinja2 3.0.3
jsonschema 4.3.3
keras 2.7.0
Keras-Preprocessing 1.1.2
kiwisolver 1.3.2
libclang 12.0.0
lz4 3.1.10
Markdown 3.3.6
MarkupSafe 2.0.1
matplotlib 3.5.1
mlagents 0.27.0
mlagents-envs 0.27.0
msgpack 1.0.3
numpy 1.22.0
oauthlib 3.1.1
opencv-python 3.4.17.61
opt-einsum 3.3.0
packaging 21.3
pandas 1.3.5
Pillow 8.2.0
pip 21.2.4
protobuf 3.19.1
pyasn1 0.4.8
pyasn1-modules 0.2.8
pyglet 1.5.15
pyparsing 3.0.6
pypiwin32 223
pyrsistent 0.18.0
python-dateutil 2.8.2
pytz 2021.3
pywin32 303
PyYAML 6.0
ray 1.9.1

Reproduction script

ray/rllib/examples/unity3d_env_local.py

Anything else

No response

Are you willing to submit a PR?

gjoliver commented 2 years ago

Would really appreciate it if you could help submit a PR. Thanks a ton.

sven1977 commented 2 years ago

Thanks for the PR @grypesc, this is awesome! I just pushed some minor changes to it (LINT was failing, and I re-pulled from our upstream master) and will merge right after confirming that all tests are passing now ...