rail-berkeley / rlkit

Collection of reinforcement learning algorithms

How to log performance metrics when evaluating a trained policy? #108

Closed PierreExeter closed 4 years ago

PierreExeter commented 4 years ago

Hello,

When evaluating the policy with

python scripts/run_policy.py LOCAL_LOG_DIR/<exp_prefix>/<foldername>/params.pkl

I noticed that logger.dump_tabular() is called after each rollout. However, it never actually prints anything because the self._tabular list is never appended to, see this line (the condition in that if statement is never satisfied).

I would like to compute the average return (and some other custom metrics) over the evaluation episodes. I know I can probably implement it myself, but I was hoping this logger would do the job for me.
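
(For reference, a minimal sketch of what the manual version might look like, assuming the env and policy loaded from params.pkl and rlkit's rollout helper; the evaluate name and its defaults are just for illustration:)

import numpy as np
from rlkit.samplers.rollout_functions import rollout

def evaluate(env, policy, n_episodes=10, max_path_length=1000):
    # Roll out the policy and average the undiscounted return over episodes.
    returns = []
    for _ in range(n_episodes):
        path = rollout(env, policy, max_path_length=max_path_length)
        returns.append(np.sum(path['rewards']))  # per-episode return
    return np.mean(returns)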

How can I use this logger when evaluating a trained policy?

Thanks, Pierre

vitchyr commented 4 years ago

You need to make sure that there's a call to logger.record_tabular. If you want generic path information, I would recommend using this: https://github.com/vitchyr/rlkit/blob/f136e140a57078c4f0f665051df74dffb1351f33/rlkit/core/eval_util.py#L13

So you can add something like

from rlkit.core import logger
from rlkit.core.eval_util import get_generic_path_information

for k, v in get_generic_path_information(paths).items():
    logger.record_tabular(k, v)
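
For context, inside the rollout loop of scripts/run_policy.py this could look roughly like the sketch below (assuming the simulate_policy structure at the time; the dump_tabular call at the end of each iteration is what finally writes out the row):

while True:
    path = rollout(env, policy, max_path_length=args.H, render=True)
    # Record per-path statistics (returns, rewards, action stats, ...).
    for k, v in get_generic_path_information([path]).items():
        logger.record_tabular(k, v)
    if hasattr(env, "log_diagnostics"):
        env.log_diagnostics([path])
    # self._tabular is now non-empty, so this actually prints a table.
    logger.dump_tabular()
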
PierreExeter commented 4 years ago

Ok thanks a lot!