Khrylx / PyTorch-RL

PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.
MIT License

Memory leak during GPU training #2

Closed · erschmidt closed this issue 6 years ago

erschmidt commented 6 years ago

Hi,

I'm not sure if this is a PyTorch-specific issue, but during training my GPU memory usage grows substantially. For my experiment I'm using a large number of samples to optimize my discriminator and policy (~4,000 samples, where the state is a 51-dimensional vector and the action is a 2-dimensional vector). The network trains fine at first but runs into an out-of-memory exception after a few epochs. This happens for both TRPO and PPO.

I can reproduce this issue by using your gail.gail_gym script and changing the min-batch-size argument to 100,000 (the problem probably exists with smaller batch sizes as well, but this makes the increase more obvious).

The memory consumption starts at 900 MB after the first epoch (which seems reasonable) but increases to 1,300 MB after the third and to 1,700 MB after epoch ~160. Note that the increase is not gradual but happens all at once after an arbitrary number of epochs.
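
In case it is useful for tracking this kind of growth, here is a minimal per-epoch logging sketch. It assumes a recent PyTorch build that exposes torch.cuda.memory_allocated() and torch.cuda.memory_reserved(); none of this comes from the repo itself.

```python
# Minimal sketch (not from this repo): per-epoch GPU memory logging,
# assuming a recent PyTorch that provides these torch.cuda helpers.
import torch

def log_gpu_memory(epoch):
    # Memory actually held by live tensors vs. memory reserved by the
    # caching allocator (the latter is closer to what nvidia-smi reports).
    allocated = torch.cuda.memory_allocated() / 1024**2
    reserved = torch.cuda.memory_reserved() / 1024**2
    print(f"epoch {epoch}: allocated {allocated:.0f} MB, reserved {reserved:.0f} MB")
```

If the allocated number stays flat while the reserved number keeps climbing, the growth is in the caching allocator rather than in live tensors.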

I tried to del variables after use, but this does not solve it.

I'm using CUDA 7.5 and PyTorch 0.3.0.

Thank you very much!

erschmidt commented 6 years ago

I think I fixed the problem by calling torch.cuda.empty_cache() at the end of update_params().

Memory usage now sits at around 400 MB for the network, peaks at about 900 MB during training, and then goes back down to 400 MB again.
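
For illustration, a minimal sketch of where the call goes; the update_params() body below is a simplified stand-in, not the repo's actual implementation.

```python
# Minimal sketch (simplified stand-in for the repo's update_params()):
# release unused cached GPU blocks after each update so the allocator's
# footprint does not keep growing across epochs.
import torch

def update_params(batch):
    # ... discriminator / policy / value updates on the GPU go here ...
    # Intermediate tensors go out of scope by the end of the update, but the
    # caching allocator keeps their blocks reserved; empty_cache() returns
    # those unreferenced cached blocks to the driver.
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
```

Note that empty_cache() only releases cached blocks that are no longer referenced by any tensor; it does not free live tensors, so it trades a bit of re-allocation overhead for a smaller reported memory footprint.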

I guess this is a PyTorch-related issue, since a lot of people have reported this kind of problem.

Hope that helps other people as well.

Khrylx commented 6 years ago

Thank you for finding this issue!