pfnet / pfrl

PFRL: a PyTorch-based deep reinforcement learning library
MIT License

[Feature Request/Proposal] use MLFlow option or best practice of experiments management #94

Open Chachay opened 3 years ago

Chachay commented 3 years ago

Thank you for the excellent RL library. PFRL makes my life so much easier.

As managing experiments becomes more complicated, I have tried PFRL with MLFlow, and I'm satisfied with my initial implementation (see the code below). MLFlow helps to compare the performance of algorithms, to manage trained models, and to monitor training results remotely.

On the other hand, if PFRL natively supported MLFlow, it would be even easier to use, and we could benefit from the collective wisdom of other users' experiment-management efforts. Tensorboard support was added just a few months ago, and I'm sure each user wants to use different tools, so I've opened this issue for discussion.

The motivations for native support:

An alternative to native MLFlow support:

A more general question about experiment management:

How to use PFRL and MLFlow together

    import mlflow

    from pfrl import experiments

    # `args`, `agent`, and `make_env` are defined elsewhere in the script.
    existing_exp = mlflow.get_experiment_by_name(args.env)
    if not existing_exp:
        mlflow.create_experiment(args.env)
    mlflow.set_experiment(args.env)

    # Evaluation hook: log the mean evaluation return at each evaluation.
    def log_mlflow(env, agent, evaluator, step, eval_score):
        mlflow.log_metric("R_mean", eval_score, step=step)

    with mlflow.start_run():
        mlflow.log_param("Algo", "SAC")
        mlflow.log_param("OutDir", args.outdir)

        try:
            experiments.train_agent_with_evaluation(
                    agent=agent,
                    env=make_env(0, False),
                    eval_env=make_env(0, True),
                    outdir=args.outdir,
                    steps=args.steps,
                    eval_n_steps=None,
                    eval_n_episodes=args.eval_n_runs,
                    eval_interval=args.eval_interval,
                    save_best_so_far_agent=True,
                    evaluation_hooks=(log_mlflow,),
            )
        finally:
            # Upload everything PFRL wrote to outdir (scores, saved agents)
            # as artifacts, even if training was interrupted. Logging inside
            # the `with` block keeps the run active; the context manager
            # ends the run on exit, so no explicit end_run() is needed.
            mlflow.log_artifacts(args.outdir)
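The plain-function hook above could also be generalized. As a hedged sketch (assuming the same `(env, agent, evaluator, step, eval_score)` hook signature used in the snippet; the `MetricHook` class and its names are hypothetical, not part of PFRL), a small backend-agnostic hook class would let the same training code log to MLFlow, Tensorboard, or a plain file by swapping one callback:

```python
class MetricHook:
    """Hypothetical backend-agnostic evaluation hook.

    Records (step, eval_score) pairs in memory and forwards them to an
    optional logging callback, e.g.
        lambda name, value, step: mlflow.log_metric(name, value, step=step)
    Assumes the (env, agent, evaluator, step, eval_score) hook signature
    from the snippet above.
    """

    def __init__(self, log_fn=None, name="R_mean"):
        self.log_fn = log_fn    # backend callback, or None for in-memory only
        self.name = name        # metric name passed to the backend
        self.history = []       # kept in memory for later inspection

    def __call__(self, env, agent, evaluator, step, eval_score):
        self.history.append((step, eval_score))
        if self.log_fn is not None:
            self.log_fn(self.name, eval_score, step)


if __name__ == "__main__":
    # Exercise the hook with dummy arguments; no RL training needed.
    records = []
    hook = MetricHook(log_fn=lambda name, value, step: records.append((name, value, step)))
    hook(env=None, agent=None, evaluator=None, step=1000, eval_score=42.0)
    hook(env=None, agent=None, evaluator=None, step=2000, eval_score=57.5)
    print(hook.history)  # [(1000, 42.0), (2000, 57.5)]
    print(records)       # [('R_mean', 42.0, 1000), ('R_mean', 57.5, 2000)]
```

Passing `evaluation_hooks=(hook,)` instead of the bare function would then keep the MLFlow-specific code in one place.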
prabhatnagarajan commented 3 years ago

Thanks! I'm glad to hear you like our library and that it makes your life easier :)

Thanks for taking the time to produce the example and outline this proposal. We'll discuss this internally and get back to you.