pfnet / pfrl

PFRL: a PyTorch-based deep reinforcement learning library
MIT License

loading optimizer parameters #76

Closed. tkelestemur closed this issue 4 years ago

tkelestemur commented 4 years ago

I'm working on task-specific curriculum learning for RL. Basically, I train a PPO agent on a simple task and then, starting from the saved weights, train on a harder task. Before I start training on the second task, I call agent.load() to load the network parameters. I realized that agent.load() also loads the optimizer parameters.

The problem is that I use learning rate decay for the first task. So if I load the last saved parameters, the optimizer would get a learning rate close to zero. Did I understand this correctly? If so, saved_attributes should be configurable when creating a PPO agent.
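To make the concern concrete, here is a minimal sketch in plain PyTorch (not PFRL itself) showing that an optimizer's state dict carries its learning rate along with it. The tiny `Linear` model and the rates `3e-4` and `1e-8` are placeholders chosen for illustration, not values from the script.

```python
import torch

# Hypothetical stand-in for the policy network; any nn.Module behaves the same.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

# Simulate the end of a decay schedule on the first task.
for group in optimizer.param_groups:
    group["lr"] = 1e-8

saved_state = optimizer.state_dict()

# A freshly built optimizer for the second task starts at the intended rate...
new_optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
# ...but restoring the saved state replaces it with the decayed value.
new_optimizer.load_state_dict(saved_state)

restored_lr = new_optimizer.param_groups[0]["lr"]
print(restored_lr)
```

So any loader that restores the optimizer state will also restore whatever learning rate was in effect when the checkpoint was written.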

Edit: below is a snippet from my training script:

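import os

import pfrl
from pfrl import experiments


# create_multi_env, create_model and create_agent are helpers defined elsewhere in the script.
def train(args):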
    pfrl.utils.set_random_seed(args.seed)
    env = create_multi_env(args)
    args.num_actions = env.action_space.n
    args.obs_channels = env.observation_space.shape[0]
    print('Environment observation space: {}'.format(env.observation_space.shape))
    print('Environment action space     : {}'.format(env.action_space.n))

    model, opt = create_model(args)

    agent = create_agent(model, opt)

    if args.preload:
        preload_dir = os.path.join(args.outdir, args.preload, '20000000_finish')
        print('Loading pre-trained weights from {}...'.format(preload_dir))
        agent.load(preload_dir)

    args.outdir = os.path.join(args.outdir, args.model_name)

    def lr_setter(env, agent, value):
        for param_group in agent.optimizer.param_groups:
            param_group["lr"] = value

    step_hooks = [experiments.LinearInterpolationHook(args.steps, args.lr, 0, lr_setter)]

    print('Starting training...')
    experiments.train_agent_batch_with_evaluation(
        agent=agent,
        env=env,
        outdir=args.outdir,
        steps=args.steps,
        eval_n_steps=None,
        eval_n_episodes=args.eval_num_runs,
        eval_interval=args.eval_interval,
        checkpoint_freq=args.checkpoint_freq,
        log_interval=args.log_interval,
        save_best_so_far_agent=True,
        step_hooks=step_hooks,
    )

muupan commented 4 years ago

It is true that agent.load loads both model and optimizer parameters. If you want to load model parameters only, you can load the model's weights directly with torch.load and pass that model to your agent instead of calling agent.load.
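A minimal sketch of that suggestion in plain PyTorch. It assumes the checkpoint directory holds the model's state dict in a file named model.pt next to an optimizer.pt; check your own output directory for the actual file names. `build_model` is a hypothetical stand-in for the network that create_model(args) returns.

```python
import os
import tempfile

import torch


def build_model():
    # Hypothetical stand-in for the PPO network from create_model(args).
    return torch.nn.Linear(4, 2)


# First task: write model and optimizer state side by side.
# The names model.pt / optimizer.pt are an assumption about the checkpoint layout.
ckpt_dir = tempfile.mkdtemp()
old_model = build_model()
old_opt = torch.optim.Adam(old_model.parameters(), lr=1e-8)  # decayed rate
torch.save(old_model.state_dict(), os.path.join(ckpt_dir, "model.pt"))
torch.save(old_opt.state_dict(), os.path.join(ckpt_dir, "optimizer.pt"))

# Second task: build a fresh model and optimizer, then restore only the weights.
new_model = build_model()
new_opt = torch.optim.Adam(new_model.parameters(), lr=3e-4)
state = torch.load(os.path.join(ckpt_dir, "model.pt"), map_location="cpu")
new_model.load_state_dict(state)

# The weights come from the first task; the optimizer keeps its fresh settings.
weights_match = all(
    torch.equal(a, b)
    for a, b in zip(old_model.state_dict().values(), new_model.state_dict().values())
)
current_lr = new_opt.param_groups[0]["lr"]
```

In the training script above, this would mean calling model.load_state_dict(...) on the result of create_model(args) before create_agent(model, opt), and dropping the agent.load(preload_dir) call.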

tkelestemur commented 4 years ago

Oh yeah, that makes sense. Thanks!