Open · ashutoshpndy opened this issue 1 year ago
Thank you for your significant contributions to Vision-and-Language Navigation.

I've been using the bash pretrain_src/scripts/pretrain_r2r.bash script to pre-train on the nine given tasks. However, I've noticed that CPU memory consumption rises steadily with each training iteration. By around 70,000 steps it exhausts my machine's 128 GB of RAM, resulting in an Out Of Memory (OOM) error. It's worth noting that while I'm training on GPU devices, the OOM occurs in CPU memory, not GPU memory.

Could you provide any insights or potential solutions to this problem? Thank you.
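A minimal sketch of how to confirm that the growth tracks training steps, assuming psutil is installed; the helper below is illustrative and not part of the repo:

```python
# Illustrative only: not part of the repo. Requires `pip install psutil`.
import os
import psutil

_proc = psutil.Process(os.getpid())

def total_rss_gb() -> float:
    """CPU RSS of this process plus any DataLoader worker children, in GB."""
    procs = [_proc] + _proc.children(recursive=True)
    total = 0
    for p in procs:
        try:
            total += p.memory_info().rss
        except psutil.NoSuchProcess:  # a worker may exit between calls
            pass
    return total / 1024 ** 3

# Called periodically inside the training loop, e.g.:
# if step % 1000 == 0:
#     print(f"step {step}: CPU RSS = {total_rss_gb():.2f} GB")
```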
Hi,

To make pre-training faster, we store image features in CPU memory. In addition, pre-fetching images also takes CPU memory. CPU memory usage scales with the num_workers you use: a smaller num_workers uses less CPU memory, but pre-training will be much slower. Alternatively, you can choose not to store image features in CPU memory to save some memory.
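In code, that trade-off looks roughly like the sketch below. The dataset class, feature-file name, and keys are hypothetical, not the repo's actual implementation:

```python
import h5py
import torch
from torch.utils.data import DataLoader, Dataset

class ImageFeatureDataset(Dataset):
    """Toy feature dataset: either cache everything in RAM or read lazily."""

    def __init__(self, feature_file, keys, in_memory=False):
        self.feature_file = feature_file
        self.keys = keys
        self.in_memory = in_memory
        self._h5 = None   # opened lazily, once per DataLoader worker
        self._cache = {}
        if in_memory:
            # Fast: everything is preloaded into CPU RAM, but RSS grows
            # with the full feature set (the behaviour described above).
            with h5py.File(feature_file, "r") as f:
                self._cache = {k: torch.from_numpy(f[k][()]) for k in keys}

    def __len__(self):
        return len(self.keys)

    def __getitem__(self, idx):
        key = self.keys[idx]
        if self.in_memory:
            return self._cache[key]
        if self._h5 is None:  # each worker opens its own file handle
            self._h5 = h5py.File(self.feature_file, "r")
        return torch.from_numpy(self._h5[key][()])

# Fewer workers -> less prefetch memory in flight, but a slower pipeline:
# loader = DataLoader(ImageFeatureDataset("img_features.h5", keys),
#                     batch_size=64, num_workers=2, pin_memory=True)
```

Keeping in_memory=False trades per-item disk reads for a flat memory footprint, and opening the HDF5 file lazily inside __getitem__ gives each DataLoader worker its own handle.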
Best,
Jialu

Hi Jialu,

Thank you so much for the reply and for sharing your insights on pre-training optimization.