[rllib] Adding custom_metrics to CLIReporter() failed in IMPALA

forhonourlx commented 4 years ago

Hi Ray Team,

I am trying to add custom_metrics to CLIReporter(), but the status table still shows only the default columns (see the output below). Could somebody give me a hand? Thanks in advance.

if __name__ == '__main__':
    reporter = CLIReporter()
    # Register one column per 'ep_*' metric, using the name without the prefix as the header.
    for name_ in env_config["return_info_name_list"]:
        if name_.startswith('ep_'):
            reporter._metric_columns[name_] = name_[3:]
            reporter._metric_columns[name_+'_mean'] = name_[3:]

    ray.tune.run(ImpalaTrainer,
                 progress_reporter=reporter,
                 config={...
                     "callbacks": MetricCallbacksWrapper,})

class MetricCallbacksWrapper(DefaultCallbacks):
    def on_episode_start(self, worker: RolloutWorker, base_env: BaseEnv,
                         policies: typing.Dict[str, Policy],
                         episode: MultiAgentEpisode, **kwargs):
        env = base_env.get_unwrapped()[0].env
        for name_ in env.return_info_name_list:
            if name_.startswith('ep_'):
                episode.custom_metrics[name_] = None

    def on_episode_end(self, worker: RolloutWorker, base_env: BaseEnv,
                       policies: typing.Dict[str, Policy], episode: MultiAgentEpisode,
                       **kwargs):

        info_str = ''
        info_dict = episode.last_info_for()
        for name_, v in info_dict.items():
            if name_.startswith('ep_'):
                episode.custom_metrics[name_] = v
                info_str += f'{name_}:{v:.4f}\n'
        print(f'on_episode_end:\n{info_str}')

Result:

(pid=5392) on_episode_end:
(pid=5392) ep_total_return_rate:0.0019
(pid=5392) ep_order_count:6.0000
(pid=5392) ep_win_rate:0.6000
(pid=5392) ep_order_entry_velocity:0.0102
(pid=5392) ep_long_to_total_ratio:0.5000
(pid=5392) ep_avg_MFE:0.0008
(pid=5392) ep_sharpe_ratio:-22.3391

Result for IMPALA_FxEnv_b6510_00000:
  custom_metrics:
    ep_avg_MFE_max: 0.001540000000000023
    ep_avg_MFE_mean: 0.0006416455978013039
    ep_avg_MFE_min: 0.00016333333333329314
    ep_long_to_total_ratio_max: 0.875
    ep_long_to_total_ratio_mean: 0.5310642971395441
    ep_long_to_total_ratio_min: 0.16666666666666666
    ep_order_count_max: 184
    ep_order_count_mean: 28.634146341463413
    ep_order_count_min: 4
    ep_order_entry_velocity_max: 0.020698051948051948
    ep_order_entry_velocity_mean: 0.012475068898716472
    ep_order_entry_velocity_min: 0.006825938566552901
    ep_sharpe_ratio_max: 49.25959419484006
    ep_sharpe_ratio_mean: -16.578713693954487
    ep_sharpe_ratio_min: -61.33495635798412
    ep_total_return_rate_max: 0.06949751584552555
    ep_total_return_rate_mean: -0.04765599001411366
    ep_total_return_rate_min: -0.2038292187034334
    ep_win_rate_max: 1.0
    ep_win_rate_mean: 0.46346413652680657
    ep_win_rate_min: 0.0
  date: 2020-06-07_10-46-46

......

  time_since_restore: 124.94992399215698
  time_this_iter_s: 8.119489192962646
  time_total_s: 124.94992399215698
  timers:
    sample_throughput: 19858.857
    sample_time_ms: 25.178
  timestamp: 1591498006
  timesteps_since_restore: 0
  timesteps_total: 179500
  training_iteration: 10
  trial_id: b6510_00000

== Status ==
Memory usage on this node: 16.9/94.2 GiB
Using FIFO scheduling algorithm.
Resources requested: 16/16 CPUs, 1/1 GPUs, 0.0/54.59 GiB heap, 0.0/18.8 GiB objects
Result logdir: /home/simon/ray_results/IMPALA
Number of trials: 1 (1 RUNNING)
+--------------------------+----------+------------------+--------+------------------+--------+----------+
| Trial name               | status   | loc              |   iter |   total time (s) |     ts |   reward |
|--------------------------+----------+------------------+--------+------------------+--------+----------|
| IMPALA_FxEnv_b6510_00000 | RUNNING  | 192.168.1.3:5389 |     10 |           124.95 | 179500 | -48.6208 |
+--------------------------+----------+------------------+--------+------------------+--------+----------+
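A note on the output above: the result dump lists each custom metric under a nested `custom_metrics` entry with `_max`, `_mean`, and `_min` suffixes, while the registered column keys are bare names such as `ep_win_rate` and `ep_win_rate_mean`. One possible explanation, which is an assumption and not confirmed in this thread, is that the reporter matches column keys against a flattened result in which nested keys are joined with `/`, so only a key like `custom_metrics/ep_win_rate_mean` would match. The sketch below illustrates that idea in plain Python. The helpers `summarize_custom_metrics`, `flatten`, and `custom_metric_columns` are hypothetical stand-ins written for illustration, not Ray functions, and the episode values are made up.

```python
from statistics import mean


def summarize_custom_metrics(episodes):
    """Aggregate per-episode metrics into <name>_max/_mean/_min entries.

    Hypothetical stand-in that mirrors the naming seen in the result dump.
    """
    summary = {}
    names = sorted({name for ep in episodes for name in ep})
    for name in names:
        values = [ep[name] for ep in episodes if name in ep]
        summary[f"{name}_max"] = max(values)
        summary[f"{name}_mean"] = mean(values)
        summary[f"{name}_min"] = min(values)
    return summary


def flatten(d, parent_key="", sep="/"):
    """Flatten a nested dict, joining keys with `sep` (assumed delimiter)."""
    flat = {}
    for key, value in d.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def custom_metric_columns(names, stat="mean"):
    """Map flattened result keys to short headers for 'ep_*' metrics."""
    return {
        f"custom_metrics/{name}_{stat}": name[3:]
        for name in names
        if name.startswith("ep_")
    }


# Made-up per-episode values, as a callback might store them.
episodes = [
    {"ep_win_rate": 0.5, "ep_order_count": 6.0},
    {"ep_win_rate": 1.0, "ep_order_count": 10.0},
]
result = {
    "custom_metrics": summarize_custom_metrics(episodes),
    "training_iteration": 10,
}
flat_result = flatten(result)

columns = custom_metric_columns(["ep_win_rate", "ep_order_count", "other"])
# Look up each column key in the flattened result, as a table row would.
row = {header: flat_result.get(key) for key, header in columns.items()}
print(row)
```

If this assumption holds for the Ray version in use, the same keys could be registered through the public `reporter.add_metric_column(key, header)` method instead of writing to the private `_metric_columns` attribute. Whether nested keys are resolved may depend on the Ray version, so this is worth checking against the installed release.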

stale[bot] commented 3 years ago

Hi, I'm a bot from the Ray team :)

To help human contributors to focus on more relevant issues, I will automatically add the stale label to issues that have had no activity for more than 4 months.

If there is no further activity in the next 14 days, the issue will be closed!

If you'd like to keep the issue open, just leave any comment, and the stale label will be removed!
If you'd like to get more attention to the issue, please tag one of Ray's contributors.

You can always ask for help on our discussion forum or Ray's public slack channel.

stale[bot] commented 3 years ago

Hi again! The issue will be closed because there has been no more activity in the 14 days since the last message.

Please feel free to reopen or open a new issue if you'd still like it to be addressed.

Again, you can always ask for help on our discussion forum or Ray's public slack channel.

Thanks again for opening the issue!

ray-project / ray

[rllib] Adding custom_metrics to CLIReporter() failed in IMPALA #8818