What do final_return, final_std, best_return, and best_std mean?
How are these results calculated from the metrics written to the log file, e.g., with evaluators={"environment": d3rlpy.metrics.EnvironmentEvaluator(env)}?
Does final_return mean the return obtained in the last epoch of a single experiment, or the average final return across several seeds?
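
To make the question concrete, here is a minimal sketch of one reading I can imagine. It is only my assumption, not something I found confirmed anywhere: the final and best per-epoch evaluation returns are taken for each seed and then averaged across seeds, with the std also computed across seeds. The function name summarize, the input layout (one list of per-epoch returns per seed), and the sample numbers are all hypothetical.

```python
import statistics

def summarize(returns_per_seed):
    """Aggregate per-epoch evaluation returns across seeds.

    returns_per_seed: one list of per-epoch returns for each seed
    (hypothetical layout, e.g. the values logged by the "environment" evaluator).
    """
    # Assumed reading: "final" is the last epoch's return of each run.
    final = [run[-1] for run in returns_per_seed]
    # Assumed reading: "best" is the highest return over all epochs of each run.
    best = [max(run) for run in returns_per_seed]
    return {
        "final_return": statistics.mean(final),
        "final_std": statistics.pstdev(final),  # population std across seeds
        "best_return": statistics.mean(best),
        "best_std": statistics.pstdev(best),
    }

# Two made-up seeds with three evaluation epochs each.
runs = [[10.0, 30.0, 20.0], [12.0, 25.0, 24.0]]
summary = summarize(runs)
print(summary)
```

Is this the intended calculation, or are the values taken from a single run (and if the std is across seeds, is it the population or the sample std)?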