pfnet / pfrl

PFRL: a PyTorch-based deep reinforcement learning library
MIT License

[Feature Request/Proposal] use MLFlow option or best practice of experiments management #94

Open Chachay opened 3 years ago

Chachay commented 3 years ago

Thank you for the excellent RL library. PFRL makes my life so much easier.

As managing experiments becomes more complicated, I have tried PFRL with MLFlow, and I'm satisfied with my initial implementation (see the code below). MLFlow helps to compare the performance of algorithms, to manage trained models, and to monitor training results remotely.

On the other hand, if PFRL natively supported MLFlow, it would be even easier to use, and we could benefit from the collective wisdom of other users' experiment-management efforts. Tensorboard support was added just a few months ago, and I'm sure each user wants to use different tools, so I've opened this issue for discussion.

The motivations for native support:

An alternative to native MLFlow support:

A more general question about experiment management:

How to use PFRL and MLFlow together

    import mlflow

    from pfrl import experiments

    # `args`, `agent`, and `make_env` are defined elsewhere in the script.
    existing_exp = mlflow.get_experiment_by_name(args.env)
    if not existing_exp:
        mlflow.create_experiment(args.env)
    mlflow.set_experiment(args.env)

    # Evaluation hook: log the mean evaluation return at each evaluation.
    def log_mlflow(env, agent, evaluator, step, eval_score):
        mlflow.log_metric("R_mean", eval_score, step=step)

    with mlflow.start_run():
        mlflow.log_param("Algo", "SAC")
        mlflow.log_param("OutDir", args.outdir)

        try:
            experiments.train_agent_with_evaluation(
                    agent=agent,
                    env=make_env(0, False),
                    eval_env=make_env(0, True),
                    outdir=args.outdir,
                    steps=args.steps,
                    eval_n_steps=None,
                    eval_n_episodes=args.eval_n_runs,
                    eval_interval=args.eval_interval,
                    save_best_so_far_agent=True,
                    evaluation_hooks=(log_mlflow,),
            )
        finally:
            # Upload everything PFRL wrote to outdir (scores, saved agents)
            # as artifacts, even if training was interrupted. Logging inside
            # the `with` block keeps the run active; the context manager
            # ends the run on exit, so no explicit end_run() is needed.
            mlflow.log_artifacts(args.outdir)
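The plain-function hook above could also be generalized. As a hedged sketch (assuming the same `(env, agent, evaluator, step, eval_score)` hook signature used in the snippet; the `MetricHook` class and its names are hypothetical, not part of PFRL), a small backend-agnostic hook class would let the same training code log to MLFlow, Tensorboard, or a plain file by swapping one callback:

```python
class MetricHook:
    """Hypothetical backend-agnostic evaluation hook.

    Records (step, eval_score) pairs in memory and forwards them to an
    optional logging callback, e.g.
        lambda name, value, step: mlflow.log_metric(name, value, step=step)
    Assumes the (env, agent, evaluator, step, eval_score) hook signature
    from the snippet above.
    """

    def __init__(self, log_fn=None, name="R_mean"):
        self.log_fn = log_fn    # backend callback, or None for in-memory only
        self.name = name        # metric name passed to the backend
        self.history = []       # kept in memory for later inspection

    def __call__(self, env, agent, evaluator, step, eval_score):
        self.history.append((step, eval_score))
        if self.log_fn is not None:
            self.log_fn(self.name, eval_score, step)


if __name__ == "__main__":
    # Exercise the hook with dummy arguments; no RL training needed.
    records = []
    hook = MetricHook(log_fn=lambda name, value, step: records.append((name, value, step)))
    hook(env=None, agent=None, evaluator=None, step=1000, eval_score=42.0)
    hook(env=None, agent=None, evaluator=None, step=2000, eval_score=57.5)
    print(hook.history)  # [(1000, 42.0), (2000, 57.5)]
    print(records)       # [('R_mean', 42.0, 1000), ('R_mean', 57.5, 2000)]
```

Passing `evaluation_hooks=(hook,)` instead of the bare function would then keep the MLFlow-specific code in one place.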
prabhatnagarajan commented 3 years ago

Thanks! I'm glad to hear you like our library and that it makes your life easier :)

Thanks for taking the time to produce the example and outline this proposal. We'll discuss this internally and get back to you.