PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
ValueError if batch size is smaller than number of mini batches #72
The following line evaluates to zero and leads to a `ValueError` if the batch size resulting from the number of processes and steps is smaller than the number of mini-batches. This happens especially in the PPO case, where the default is `num_mini_batch=32`. Maybe one could catch this before training starts and raise a more descriptive error. https://github.com/ikostrikov/pytorch-a2c-ppo-acktr/blob/552eb86bbc3e338a898d4adcc506d63fdbabebe0/storage.py#L69
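Here is a minimal sketch of the failure mode, not the repository's exact code: the integer division produces a mini-batch size of zero, which PyTorch's `BatchSampler` rejects with a `ValueError`. The rollout dimensions below are illustrative.

```python
from torch.utils.data.sampler import BatchSampler, SubsetRandomSampler

num_processes, num_steps = 4, 5  # illustrative rollout dimensions
num_mini_batch = 32              # the PPO default in this repo

batch_size = num_processes * num_steps          # 20
mini_batch_size = batch_size // num_mini_batch  # 20 // 32 == 0

# BatchSampler raises ValueError for a non-positive batch size
sampler = BatchSampler(SubsetRandomSampler(range(batch_size)),
                       mini_batch_size, drop_last=False)
```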
Could you submit a pull request? You can just add an assertion at the beginning of the main file. I will be extremely busy in the next couple of weeks, but it's a reasonable fix to add.
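A possible form of that assertion, as a sketch: it assumes `main.py` parses the repo's `--num-processes`, `--num-steps`, and `--num-mini-batch` flags into `args`, and the error message is illustrative.

```python
# Fail fast before training if the rollout batch cannot be split
# into the requested number of PPO mini-batches.
assert args.num_processes * args.num_steps >= args.num_mini_batch, (
    "Rollout batch size (num_processes * num_steps = {}) must be at least "
    "num_mini_batch ({}). Increase --num-steps/--num-processes or decrease "
    "--num-mini-batch.".format(args.num_processes * args.num_steps,
                               args.num_mini_batch))
```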