Open huiwenzhang opened 1 year ago
@huiwenzhang No such issue has been reported before. Maybe you could check whether your PyTorch and CUDA versions are compatible. Sometimes that can affect memory consumption.
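For reference, a quick way to check which versions PyTorch actually sees (a minimal sketch, nothing specific to this repo):

```python
# Sanity-check the PyTorch / CUDA pairing the environment is using.
import torch

print("PyTorch:", torch.__version__)          # e.g. 2.0.1
print("Built against CUDA:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```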
I used PyTorch 2.0.1 built with CUDA 11.8; the locally installed CUDA version is 12.1. According to the official PyTorch docs, newer CUDA versions are also supported. Besides, I didn't use the GPU as you suggested, but the problem still exists. Training with the cadrl and rgl policies is fine. Do you have any other guesses about the memory leak?
@huiwenzhang I see. I don't have a clue what could be causing the issue. You could debug by removing all the code and adding it back piece by piece until the issue reappears.
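Another way to narrow it down without stripping code is to log the process memory (and Python allocation diffs) after every episode. A minimal sketch, assuming a per-episode training loop; `run_episode()` is a hypothetical stand-in for one episode of sarl / lstm-rl training:

```python
# Log RSS and the top new Python allocations after each episode
# to see which lines the memory growth comes from.
import tracemalloc
import psutil

proc = psutil.Process()
tracemalloc.start()
prev_snapshot = tracemalloc.take_snapshot()

for episode in range(100):
    run_episode()  # hypothetical: one training episode

    rss_gb = proc.memory_info().rss / 1e9
    snapshot = tracemalloc.take_snapshot()
    top = snapshot.compare_to(prev_snapshot, "lineno")[:5]
    prev_snapshot = snapshot

    print(f"episode {episode}: RSS {rss_gb:.2f} GB")
    for stat in top:
        print("   ", stat)  # lines that allocated the most new Python memory
```

If the per-episode RSS climbs steadily, the tracemalloc diff should point at the lines accumulating memory; if RSS grows while the diff stays flat, the growth is likely coming from native (C/CUDA) allocations instead.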
Hi, when running the multi-human policies such as sarl and lstm-rl, I noticed a drastic memory increase as training goes on. The used memory grew from about 4 GB to 20 GB after 100 training episodes. I debugged for a long time but still have no clue what's going wrong. @ChanganVR Please have a look.