[Tune] Air output metrics are replaced by irrelevant inferred metrics #45547

Open flash-freezing-lava opened 1 month ago

flash-freezing-lava commented 1 month ago

What happened + What you expected to happen

For the script below, which uses Tune with RLlib, the output changed (compared to version ~=2.9.0) and now contains irrelevant columns like num_healthy_workers instead of default columns like episode_reward_mean.

│ Trial name                    status       iter     total time (s)      ts     num_healthy_workers     ...flight_async_reqs     ...e_worker_restarts     ...ent_steps_sampled │
│ PPO_CartPole-v1_39c7b_00000   RUNNING         8            31.3684   32000                       1                        0                        0                    32000 │

Expected output was like:

│ Trial name                    status       iter     total time (s)      ts     reward │
│ PPO_CartPole-v1_12104_00000   RUNNING         8            33.1398   32000     223.32 │

The cause seems to be that

  1. episode_reward_mean is now returned only as env_runners/episode_reward_mean, so it is not found when checking DEFAULT_COLUMNS and therefore does not appear in the output
  2. _infer_user_metrics then falls back to guessing, which picks the irrelevant metrics
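
To make the suspected mechanism concrete, here is a minimal, self-contained sketch. It is not Ray's actual code: `lookup`, `pick_columns`, the one-entry `DEFAULT_COLUMNS` dict, and the fallback rule (take top-level numeric entries) are all hypothetical stand-ins for what the reporter seems to do.

```python
# Simplified illustration of the column selection described above.
# All names here are hypothetical stand-ins, not Ray's real code.

DEFAULT_COLUMNS = {"episode_reward_mean": "reward"}


def lookup(result, key, sep="/"):
    """Resolve a key such as 'a/b' against a nested result dict."""
    node = result
    for part in key.split(sep):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def pick_columns(result, defaults):
    """Return {column_name: value} for the progress table."""
    found = {}
    for key, name in defaults.items():
        value = lookup(result, key)
        if value is not None:
            found[name] = value
    if found:
        return found
    # Fallback (assumed behaviour): guess from top-level numeric entries.
    return {
        k: v
        for k, v in result.items()
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    }


# Old layout: the reward sits at the top level of the result dict.
old_result = {"episode_reward_mean": 223.32, "num_healthy_workers": 1}
# New layout: the reward is nested under "env_runners".
new_result = {
    "env_runners": {"episode_reward_mean": 223.32},
    "num_healthy_workers": 1,
}

print(pick_columns(old_result, DEFAULT_COLUMNS))  # {'reward': 223.32}
print(pick_columns(new_result, DEFAULT_COLUMNS))  # {'num_healthy_workers': 1}

# Pointing the default at the nested key restores the expected column.
patched = {"env_runners/episode_reward_mean": "reward"}
print(pick_columns(new_result, patched))  # {'reward': 223.32}
```

If this reading is right, a fix would be to have the default columns refer to the env_runners/ prefixed keys as well, so the inference fallback is never reached for RLlib results.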

Versions / Dependencies

Ray: 2.23.0 Python: 3.11.9 OS: Arch Linux

Output of pip list:

Reproduction script

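import ray
from ray import air, tune
from ray.rllib.algorithms import PPOConfig


# Single env runner on CartPole, enough to reproduce the progress table.
config: PPOConfig = (
    PPOConfig()
    .environment("CartPole-v1", is_atari=False)
    .env_runners(num_env_runners=1, num_envs_per_env_runner=1, num_cpus_per_env_runner=1)
)

tuner = tune.Tuner(
    "PPO",
    param_space=config,
    run_config=air.RunConfig(stop={"timesteps_total": 1_500_000}),
)
# The reported columns show up in the console output while this runs.
results = tuner.fit()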


Issue Severity

Low: It annoys or frustrates me.

brieyla1 commented 1 month ago

Any update on this?

It seems we're also blocked from using certain metrics, including episode_reward_mean, inside PB2 and PB. Is anyone else having the same issue?

justinvyu commented 2 weeks ago

@sven1977 Any ideas what changed in the logged metrics here?