rlworkgroup / garage

A toolkit for reproducible reinforcement learning research.
MIT License
1.88k stars 310 forks source link

Cannot plot during training #697

Closed xht033 closed 5 years ago

xht033 commented 5 years ago
from garage.experiment import LocalRunner, run_experiment
from garage.np.baselines import LinearFeatureBaseline
from garage.tf.algos import TRPO
from garage.tf.envs import TfEnv
from garage.tf.policies import CategoricalMLPPolicy

def run_task(*_):
    with LocalRunner() as runner:
        env = TfEnv(env_name='CartPole-v1')

        policy = CategoricalMLPPolicy(
            name='policy', env_spec=env.spec, hidden_sizes=(32, 32))

        baseline = LinearFeatureBaseline(env_spec=env.spec)

        algo = TRPO(
            env_spec=env.spec,
            policy=policy,
            baseline=baseline,
            max_path_length=100,
            discount=0.99,
            max_kl_step=0.01)

        runner.setup(algo, env)
        runner.train(n_epochs=100, batch_size=4000,plot=True)

run_experiment(
    run_task,
    snapshot_mode='last',
    seed=4,
    n_parallel=4,
    plot=True,
    use_tf=False,
    use_gpu=False
)

########################################################## 3) Why we removed viskit? I cannot find it in garage.

Thanks! I really like rllab and garage.

ryanjulian commented 5 years ago

Please file different issues for different problems. This isn't a conversation forum, after all, this is a tool for developing software. I have forked your second issue into #698

viskit can be found at https://github.com/rlworkgroup/viskit or by installing using pip install viskit

Can you please provide a specific launcher script which reproduces the problem you're describing (cannot plot during training)?

xht033 commented 5 years ago

I tried these scripts in garage/example/tf/: trpo_cartpole.py, trpo_gym_cartpole.py, vpg_cartpole.py. I set runner.train(n_epochs=100, batch_size=4000,plot=True) and

    run_experiment(
    run_task,
    snapshot_mode='last',
    seed=1,
    plot=True
    )

but all of these scripts cannot plot during training.

ryanjulian commented 5 years ago

I was able to confirm this bug.

For a hotfix, you can add self.plot = plot at the top of LocalRunner._train():

    def _train(self,
               n_epochs,
               n_epoch_cycles,
               batch_size,
               plot,
               store_paths,
               pause_for_plot,
               start_epoch=0):
        """Start actual training.

        Args:
            n_epochs(int): Number of epochs.
            n_epoch_cycles(int): Number of batches of samples in each epoch.
                This is only useful for off-policy algorithm.
                For on-policy algorithm this value should always be 1.
            batch_size(int): Number of steps in batch.
            plot(bool): Visualize policy by doing rollout after each epoch.
            store_paths(bool): Save paths in snapshot.
            pause_for_plot(bool): Pause for plot.
            start_epoch: (internal) The starting epoch.
                Use for experiment resuming.

        Returns:
            The average return in last epoch cycle.

        """
        assert self.has_setup, ('Use Runner.setup() to setup runner before '
                                'training.')

        # Save arguments for restore
        self.train_args = SimpleNamespace(
            n_epochs=n_epochs,
            n_epoch_cycles=n_epoch_cycles,
            batch_size=batch_size,
            plot=plot,
            store_paths=store_paths,
            pause_for_plot=pause_for_plot,
            start_epoch=start_epoch)

        self.plot = plot  # here
        self.start_worker()

@zequnyu @naeioi can you send in a fix?

zequnyu commented 5 years ago

Sure, I’ll do that.