ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[Tune] Air output metrics are replaced by irrelevant inferred metrics #45547

Open flash-freezing-lava opened 1 month ago

flash-freezing-lava commented 1 month ago

What happened + What you expected to happen

For the script below using Tune with RLlib, the console output changed (starting around version 2.9.0) and now contains irrelevant columns like num_healthy_workers instead of the default columns like episode_reward_mean.

╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Trial name                    status       iter     total time (s)      ts     num_healthy_workers     ...flight_async_reqs     ...e_worker_restarts     ...ent_steps_sampled │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ PPO_CartPole-v1_39c7b_00000   RUNNING         8            31.3684   32000                       1                        0                        0                    32000 │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

The expected output looks like:

╭───────────────────────────────────────────────────────────────────────────────────────╮
│ Trial name                    status       iter     total time (s)      ts     reward │
├───────────────────────────────────────────────────────────────────────────────────────┤
│ PPO_CartPole-v1_12104_00000   RUNNING         8            33.1398   32000     223.32 │
╰───────────────────────────────────────────────────────────────────────────────────────╯

The cause seems to be that:

  1. episode_reward_mean is now reported only under the nested key env_runners/episode_reward_mean, so the top-level check against DEFAULT_COLUMNS does not find it and it is dropped from the output
  2. with no default metric matched, _infer_user_metrics falls back to guessing columns and picks the irrelevant metrics
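The lookup failure in point 1 can be illustrated with a minimal sketch (simplified stand-ins for the result dict and for DEFAULT_COLUMNS, not Ray's actual internals):

```python
# Simplified stand-in for Tune's DEFAULT_COLUMNS (hypothetical subset).
DEFAULT_COLUMNS = ["episode_reward_mean", "episode_len_mean"]

# Before the change, the metric was at the top level of the result dict.
old_result = {"episode_reward_mean": 223.32, "num_healthy_workers": 1}
# After the change, it is nested under "env_runners".
new_result = {"env_runners": {"episode_reward_mean": 223.32}, "num_healthy_workers": 1}

def matched_columns(result):
    # A flat, top-level membership test, as the reporter effectively performs.
    return [c for c in DEFAULT_COLUMNS if c in result]

print(matched_columns(old_result))  # ['episode_reward_mean']
print(matched_columns(new_result))  # [] -> triggers the fallback to inferred metrics
```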

Versions / Dependencies

Ray: 2.23.0
Python: 3.11.9
OS: Arch Linux

Output of pip list:

Package                      Version
---------------------------- -----------
absl-py                      2.1.0
aiohttp                      3.9.5
aiohttp-cors                 0.7.0
aiosignal                    1.3.1
annotated-types              0.6.0
astunparse                   1.6.3
attrs                        23.2.0
box2d-py                     2.3.5
cachetools                   5.3.3
certifi                      2024.2.2
charset-normalizer           3.3.2
click                        8.1.7
cloudpickle                  3.0.0
colorful                     0.5.6
contourpy                    1.2.1
cycler                       0.12.1
decorator                    5.1.1
distlib                      0.3.8
dm-tree                      0.1.8
Farama-Notifications         0.0.4
filelock                     3.14.0
flatbuffers                  24.3.25
fonttools                    4.51.0
frozenlist                   1.4.1
fsspec                       2024.5.0
gast                         0.4.0
google-api-core              2.19.0
google-auth                  2.29.0
google-auth-oauthlib         1.0.0
google-pasta                 0.2.0
googleapis-common-protos     1.63.0
GPUtil                       1.4.0
grpcio                       1.63.0
gymnasium                    0.28.1
h5py                         3.11.0
idna                         3.7
imageio                      2.34.1
jax                          0.4.28
jax-jumpy                    1.0.0
Jinja2                       3.1.4
jsonschema                   4.22.0
jsonschema-specifications    2023.12.1
keras                        2.12.0
kiwisolver                   1.4.5
lazy_loader                  0.4
libclang                     18.1.1
linkify-it-py                2.0.3
lz4                          4.3.3
Markdown                     3.6
markdown-it-py               3.0.0
MarkupSafe                   2.1.5
matplotlib                   3.8.4
mdit-py-plugins              0.4.1
mdurl                        0.1.2
memray                       1.12.0
ml-dtypes                    0.4.0
mpmath                       1.3.0
msgpack                      1.0.8
multidict                    6.0.5
networkx                     3.3
numpy                        1.23.5
nvidia-cublas-cu12           12.1.3.1
nvidia-cuda-cupti-cu12       12.1.105
nvidia-cuda-nvrtc-cu12       12.1.105
nvidia-cuda-runtime-cu12     12.1.105
nvidia-cudnn-cu12            8.9.2.26
nvidia-cufft-cu12            11.0.2.54
nvidia-curand-cu12           10.3.2.106
nvidia-cusolver-cu12         11.4.5.107
nvidia-cusparse-cu12         12.1.0.106
nvidia-nccl-cu12             2.20.5
nvidia-nvjitlink-cu12        12.4.127
nvidia-nvtx-cu12             12.1.105
oauthlib                     3.2.2
opencensus                   0.11.4
opencensus-context           0.1.3
opt-einsum                   3.3.0
packaging                    24.0
pandas                       2.2.2
pillow                       10.3.0
pip                          24.0
platformdirs                 4.2.2
prometheus_client            0.20.0
proto-plus                   1.23.0
protobuf                     4.25.3
py-spy                       0.3.14
pyarrow                      16.1.0
pyasn1                       0.6.0
pyasn1_modules               0.4.0
pydantic                     2.7.1
pydantic_core                2.18.2
pygame                       2.1.3
Pygments                     2.18.0
pyparsing                    3.1.2
python-dateutil              2.9.0.post0
pytz                         2024.1
PyYAML                       6.0.1
ray                          2.23.0
referencing                  0.35.1
requests                     2.31.0
requests-oauthlib            2.0.0
rich                         13.7.1
rpds-py                      0.18.1
rsa                          4.9
scikit-image                 0.23.2
scipy                        1.13.0
setuptools                   65.5.0
shellingham                  1.5.4
six                          1.16.0
smart-open                   7.0.4
swig                         4.2.1
sympy                        1.12
tensorboard                  2.12.3
tensorboard-data-server      0.7.2
tensorboardX                 2.6.2.2
tensorflow                   2.12.0
tensorflow-estimator         2.12.0
tensorflow-io-gcs-filesystem 0.37.0
tensorflow-probability       0.24.0
termcolor                    2.4.0
textual                      0.60.1
tifffile                     2024.5.10
torch                        2.3.0
tqdm                         4.66.4
triton                       2.3.0
typer                        0.12.3
typing_extensions            4.11.0
tzdata                       2024.1
uc-micro-py                  1.0.3
urllib3                      2.2.1
virtualenv                   20.26.2
Werkzeug                     3.0.3
wheel                        0.43.0
wrapt                        1.14.1
yarl                         1.9.4

Reproduction script

import ray
from ray import air, tune
from ray.rllib.algorithms import PPOConfig

ray.init(local_mode=True)

config: PPOConfig = (
    PPOConfig()
    .environment("CartPole-v1", is_atari=False)
    .framework("torch")
    .env_runners(num_env_runners=1, num_envs_per_env_runner=1, num_cpus_per_env_runner=1)
    .learners(num_learners=1)
)

tuner = tune.Tuner(
    "PPO",
    param_space=config.to_dict(),
    run_config=air.RunConfig(stop={"timesteps_total": 1_500_000}),
)
results = tuner.fit()

ray.shutdown()
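A possible workaround (an untested sketch, not a confirmed fix): explicitly request the nested metric path as a display column via the legacy CLIReporter. This assumes the legacy progress reporter is used, which may require disabling the new output engine via the RAY_AIR_NEW_OUTPUT environment variable.

```python
import os

# Assumption: fall back to the legacy output engine so CLIReporter takes effect.
os.environ["RAY_AIR_NEW_OUTPUT"] = "0"

from ray import air, tune

# Map the nested metric path to a short column header.
reporter = tune.CLIReporter(
    metric_columns={"env_runners/episode_reward_mean": "reward"}
)

run_config = air.RunConfig(
    stop={"timesteps_total": 1_500_000},
    progress_reporter=reporter,
)
```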

Issue Severity

Low: It annoys or frustrates me.

brieyla1 commented 1 month ago

Any update on this?

It seems we're also blocked from using certain metrics, including episode_reward_mean, inside PB2 and PBT. Is anyone else having the same issue?
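For the scheduler case, note that Tune flattens nested result dicts using "/" as a separator, so the metric should be addressable by its full path, e.g. metric="env_runners/episode_reward_mean" (an assumption, not verified against every scheduler). A minimal sketch of that flattening:

```python
# Sketch of the "/"-separated flattening Tune applies to nested results,
# showing the key one would pass as the scheduler's `metric`.
def flatten(d, prefix=""):
    out = {}
    for k, v in d.items():
        key = f"{prefix}{k}"
        if isinstance(v, dict):
            out.update(flatten(v, key + "/"))
        else:
            out[key] = v
    return out

result = {"env_runners": {"episode_reward_mean": 223.32}}
print(flatten(result))  # {'env_runners/episode_reward_mean': 223.32}
```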

justinvyu commented 2 weeks ago

@sven1977 Any ideas what changed in the logged metrics here?