XuGW-Kevin / DrM

DrM, a visual RL algorithm, minimizes the dormant ratio to guide exploration-exploitation trade-offs, achieving significant improvements in sample efficiency and asymptotic performance across diverse domains.
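
Roughly, the dormant ratio is the fraction of units whose batch-averaged activation magnitude is negligible compared to the rest of their layer. A minimal sketch of that computation (the function name, hook targets, and threshold value are illustrative assumptions, not the repo's exact implementation):

```python
# Hedged sketch: estimate the tau-dormant ratio of a network on one batch.
# Assumes the model uses nn.ReLU modules (not F.relu) so forward hooks can see activations.
import torch
import torch.nn as nn

def dormant_ratio(model: nn.Module, batch: torch.Tensor, tau: float = 0.025) -> float:
    """Fraction of post-activation units whose normalized mean |activation| is <= tau."""
    acts = []
    hooks = [m.register_forward_hook(lambda _m, _i, out: acts.append(out.detach()))
             for m in model.modules() if isinstance(m, nn.ReLU)]
    with torch.no_grad():
        model(batch)
    for h in hooks:
        h.remove()

    dormant, total = 0, 0
    for a in acts:
        # per-unit mean |activation| over the batch (and spatial dims for conv layers)
        dims = tuple(d for d in range(a.dim()) if d != 1)
        score = a.abs().mean(dim=dims)
        score = score / (score.mean() + 1e-9)   # normalize by the layer's mean score
        dormant += (score <= tau).sum().item()
        total += score.numel()
    return dormant / max(total, 1)
```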

Memory leak? #3

Closed zichunxx closed 3 months ago

zichunxx commented 3 months ago

Hi! Thank you for generously sharing this work!

DrM builds upon Drqv2 and uses the same replay buffer structure, which subclasses PyTorch's IterableDataset. During training I ran into what looks like a memory leak: memory usage grows to almost 60GB by 3,000,000 steps and the training process gets killed. I have at most 64GB of RAM and am not sure whether that is enough for this kind of baseline.
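
For context, my understanding of the buffer (a simplified sketch with assumed names, not the exact repo code) is that each DataLoader worker keeps whole episodes as numpy arrays in memory up to a per-worker cap, which is where the RAM goes:

```python
# Simplified sketch of a DrQ-v2-style in-memory replay buffer (illustrative names).
# Each worker decompresses .npz episodes into RAM and only evicts past a size cap.
import numpy as np
from torch.utils.data import IterableDataset

class InMemoryReplaySketch(IterableDataset):
    def __init__(self, episode_files, max_transitions):
        self._files = list(episode_files)     # .npz episode files on disk
        self._max = max_transitions           # per-worker cap on stored transitions
        self._episodes = {}                   # file -> dict of numpy arrays, held in RAM
        self._size = 0

    def _fetch(self):
        # Load one more episode into memory; evict the oldest if over the cap.
        fn = self._files[np.random.randint(len(self._files))]
        if fn in self._episodes:
            return
        ep = dict(np.load(fn))                # whole episode decompressed into RAM
        self._size += len(ep['reward'])
        self._episodes[fn] = ep
        while self._size > self._max and self._episodes:
            old_fn, old_ep = next(iter(self._episodes.items()))
            self._size -= len(old_ep['reward'])
            del self._episodes[old_fn]

    def __iter__(self):
        while True:
            self._fetch()
            fn = list(self._episodes)[np.random.randint(len(self._episodes))]
            ep = self._episodes[fn]
            i = np.random.randint(len(ep['reward']) - 1)
            yield ep['observation'][i], ep['action'][i + 1], ep['reward'][i + 1]
```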

Have you noticed this excessive memory growth during training? Could you share your machine's RAM capacity?

Looking forward to your suggestions, thanks!

XuGW-Kevin commented 3 months ago

Hi @zichunxx, thank you for reaching out! Indeed, DrM builds upon Drqv2 and uses the same structure for the replay buffer. In my experience, training in the dmc environment does require approximately 60GB per hard task. We utilize a server with 8 A40 GPUs and 1TB of DRAM. Typically, training with 8 environments consumes around 512GB of DRAM. (Training Drqv2 on the same environment also requires a similar amount of DRAM.) I hope this information helps. If you have any more questions or need further assistance, please feel free to ask.

zichunxx commented 3 months ago

Thanks for your explanation!

We utilize a server with 8 A40 GPUs and 1TB of DRAM. Typically, training with 8 environments consumes around 512GB of DRAM.

I don't have such a high-performance workstation to train these tasks, so I tried to change the load/save mode of the replay buffer. Currently, the buffer first saves each episode to disk and then loads it back into memory. I want to keep all episodes on disk and load them only when sampling, to relieve the memory pressure, but I'm not sure this is the right way to do it. I'd appreciate suggestions from an expert like you. Thanks.
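
To be concrete, this is roughly what I have in mind (an untested sketch with hypothetical names): keep only the episode filenames in memory and read the file from disk each time it is sampled:

```python
# Hypothetical sketch of a disk-backed replay buffer: only paths live in RAM,
# and each sample re-loads its episode from disk.
import numpy as np
from torch.utils.data import IterableDataset

class DiskBackedReplaySketch(IterableDataset):
    def __init__(self, episode_files):
        self._files = list(episode_files)   # only filenames are kept in memory

    def __iter__(self):
        while True:
            fn = self._files[np.random.randint(len(self._files))]
            ep = dict(np.load(fn))          # read the episode from disk on every sample
            i = np.random.randint(len(ep['reward']) - 1)
            yield ep['observation'][i], ep['action'][i + 1], ep['reward'][i + 1]
```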

XuGW-Kevin commented 3 months ago

Thanks for raising this issue! Saving all episodes to disk to relieve memory pressure is a viable approach, especially when high-performance hardware is not available. However, it will indeed make the training process slower, since every sampled episode has to be read back from disk.
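
If that slowdown becomes a problem, one possible way to soften it (just an untested sketch, not something shipped in this repo) is to cache a bounded number of recently loaded episodes in each worker:

```python
# Untested sketch: bound the cost of disk-backed sampling by caching a limited
# number of recently loaded episodes; functools.lru_cache handles eviction.
from functools import lru_cache
import numpy as np

@lru_cache(maxsize=256)        # tune the cache size to fit your RAM budget
def load_episode(fn: str):
    return dict(np.load(fn))   # re-read from disk only on a cache miss
```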

zichunxx commented 3 months ago

Ok, thanks for your patience!