PPO stack.pop() from empty list ?

sen-pai commented 3 years ago

Hey, thanks for this library, it is very helpful! I am implementing a simple recurrent PPO and I keep getting the following error message:

INFO:pfrl.experiments.train_agent_batch:outdir:rnn_run step:2040 episode:22 last_R: 0.0 average_R:0.25239999999999996
INFO:pfrl.experiments.train_agent_batch:statistics: [('average_value', -0.06298958), ('average_entropy', 1.9389327), ('average_value_loss', nan), ('average_policy_loss', nan), ('n_updates', 0), ('explained_variance', nan)]
INFO:pfrl.experiments.train_agent_batch:Saved the agent to rnn_run/2040_except
Traceback (most recent call last):
  File "rnn_minigrid_ppo.py", line 99, in <module>
    pfrl.experiments.train_agent_batch(
  File "/home/sharan/Reccurent-GAIL/pfrl/pfrl/experiments/train_agent_batch.py", line 82, in train_agent_batch
    agent.batch_observe(obss, rs, dones, resets)
  File "/home/sharan/Reccurent-GAIL/pfrl/pfrl/agents/ppo.py", line 681, in batch_observe
    self._batch_observe_train(batch_obs, batch_reward, batch_done, batch_reset)
  File "/home/sharan/Reccurent-GAIL/pfrl/pfrl/agents/ppo.py", line 807, in _batch_observe_train
    self._update_if_dataset_is_ready()
  File "/home/sharan/Reccurent-GAIL/pfrl/pfrl/agents/ppo.py", line 429, in _update_if_dataset_is_ready
    self._update_recurrent(dataset)
  File "/home/sharan/Reccurent-GAIL/pfrl/pfrl/agents/ppo.py", line 628, in _update_recurrent
    for minibatch in _yield_subset_of_sequences_with_fixed_number_of_items(
  File "/home/sharan/Reccurent-GAIL/pfrl/pfrl/agents/ppo.py", line 165, in _yield_subset_of_sequences_with_fixed_number_of_items
    sequence = stack.pop()
IndexError: pop from empty list

I have changed the batch_size, max_recurrent_sequence_len, and the number of parallel envs, but I consistently get this error at step 2040. Can you help me out, please?

lerrytang commented 3 years ago

Hi, I ran into a similar problem at the same place. While I'm sure PFN will come up with a solution later, you can try out this PR if you want a fast test.

muupan commented 3 years ago

If I understand the issue correctly, there should be no such error if update_interval % minibatch_size == 0 and update_interval % num_envs == 0.
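For anyone who wants to check their settings before starting a run, here is a minimal sketch of that condition in plain Python. The helper names `check_update_interval` and `next_valid_update_interval` are hypothetical and not part of PFRL, and the numbers in the usage section are made-up examples rather than values from this report.

```python
from math import gcd


def check_update_interval(update_interval, minibatch_size, num_envs):
    """Return a list of problems; an empty list means the condition holds.

    Hypothetical helper (not a PFRL function). It only checks the two
    divisibility conditions mentioned in this thread.
    """
    problems = []
    if update_interval % minibatch_size != 0:
        problems.append(
            f"update_interval ({update_interval}) is not a multiple of "
            f"minibatch_size ({minibatch_size})"
        )
    if update_interval % num_envs != 0:
        problems.append(
            f"update_interval ({update_interval}) is not a multiple of "
            f"num_envs ({num_envs})"
        )
    return problems


def next_valid_update_interval(update_interval, minibatch_size, num_envs):
    """Smallest value >= update_interval divisible by both other settings."""
    # Least common multiple of minibatch_size and num_envs.
    step = minibatch_size * num_envs // gcd(minibatch_size, num_envs)
    # Round update_interval up to the next multiple of that step.
    return -(-update_interval // step) * step


if __name__ == "__main__":
    # Made-up example settings, for illustration only.
    for problem in check_update_interval(1000, 64, 8):
        print("WARNING:", problem)
    print("suggested update_interval:", next_valid_update_interval(1000, 64, 8))
```

Calling the check right before constructing the agent, and adjusting `update_interval` when it reports a problem, should be enough to satisfy the condition above.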

sen-pai commented 3 years ago

@muupan I have updated update_interval to satisfy the condition and it is working properly now. Thanks!

pfnet / pfrl

PPO stack.pop() from empty list ? #136