Closed by aceofgreens 1 year ago
Hello,
Thank you for your attention and inquiry.
The steady increase in process memory is primarily due to collected data being continuously added to the replay buffer, so the buffer's memory footprint keeps rising as it fills. A rough estimate, assuming one byte per stored value: 12*96*96/1024/1024/1024*1e6 ≈ 103GB. Here, 1e6 is the default capacity/size of the replay buffer.
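The arithmetic behind that estimate can be reproduced with a short script. This is only a sketch: the function name `replay_buffer_gib` and the `bytes_per_value` parameter are made up for illustration, and it counts raw observation storage only (one byte per value, as the formula above implies), ignoring any per-transition overhead.

```python
def replay_buffer_gib(obs_shape, capacity, bytes_per_value=1):
    """Rough size in GiB of `capacity` stored observations of shape `obs_shape`."""
    values_per_obs = 1
    for dim in obs_shape:
        values_per_obs *= dim
    return values_per_obs * bytes_per_value * capacity / 1024**3


# 12 channels (4 stacked RGB frames) at 96x96, default capacity of 1e6.
estimate = replay_buffer_gib((12, 96, 96), capacity=int(1e6))
print(f"uint8 observations:   {estimate:.1f} GiB")

# If observations were kept as float32 instead, the footprint would be 4x larger.
estimate_f32 = replay_buffer_gib((12, 96, 96), capacity=int(1e6), bytes_per_value=4)
print(f"float32 observations: {estimate_f32:.1f} GiB")
```

Strictly speaking the result is in GiB (powers of 1024), which is what the division by 1024 three times computes.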
You can indeed control memory usage by modifying the configuration settings, for example by lowering the replay buffer capacity.
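As a sketch of how a smaller capacity might be chosen, the snippet below inverts the estimate: given a memory budget, it computes how many observations fit. The helper `max_capacity`, the 32 GiB budget, and the config key `replay_buffer_size` are assumptions for illustration; check your own config file for the actual name of the capacity setting.

```python
def max_capacity(obs_shape, budget_gib, bytes_per_value=1):
    """Largest number of observations of `obs_shape` that fit in `budget_gib` GiB."""
    bytes_per_obs = bytes_per_value
    for dim in obs_shape:
        bytes_per_obs *= dim
    return int(budget_gib * 1024**3) // bytes_per_obs


# Hypothetical budget: leave 32 GiB of RAM for the replay buffer.
capacity = max_capacity((12, 96, 96), budget_gib=32)

# Hypothetical config fragment; the key name is an assumption, not a confirmed API.
policy_overrides = dict(replay_buffer_size=capacity)
print(policy_overrides)
```

This only bounds the raw observation storage, so it is worth leaving some headroom for the rest of the training process.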
If this does not solve the problem, you may need to increase the memory capacity of your system.
Best wishes.
Summary of issue
The training process gets killed by the kernel. There is a log in `dmesg` stating that the reason is "out of memory".

- Model: MuZero with self-supervision
- Environment: Pong
- Architecture is exactly the same as the default one for Atari envs except that:
  - input of shape `(B, 12, 96, 96)` (... with 4 stacked frames)
  - ... `representation` network

The process gets killed after 40k iteration steps (a bit more than 500k environment steps). The `Buffer/memory_usage/process` log shows that the total memory used starts from 0 and increases a bit faster than linearly to 6e+4, after which the process is killed.

NOTE: I have been able to reproduce the "Quick Start" training run on Pong with the default config. No issue there.
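A quick sanity check on these numbers, as a sketch under stated assumptions: one observation of shape (12, 96, 96) is stored per environment step at one byte per value, and the logged memory figure is in MB (the unit is not stated, so that part is a guess).

```python
# Assumed: one stored observation per environment step, one byte per value.
bytes_per_obs = 12 * 96 * 96
env_steps = 500_000  # "a bit more than 500k environment steps"

stored_mib = bytes_per_obs * env_steps / 1024**2
print(f"raw observations after {env_steps} steps: {stored_mib:.0f} MiB")
```

Under those assumptions the raw observations alone come to a bit over 5e+4 MiB at 500k steps, which is the same order of magnitude as the logged 6e+4.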
General questions: