ARCC-RACE / deepracer-for-dummies

a quick way to get up and running with local deepracer training environment

Memory Leak when training with GPU #60

Closed albertsundjaja closed 4 years ago

albertsundjaja commented 4 years ago

Hello, after a certain number of episodes (around 400 in my case), the training stops because the GPU runs out of memory. The problem appears to be in the TensorFlow training process.

I'm trying to figure out how to reproduce the problem, but at the moment it seems to happen randomly.

EDIT: it seems this is not a memory leak after all. As training progresses, the agent covers more of the track per episode, so the amount of training data grows and eventually the GPU can't handle it.
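
As a hedged mitigation sketch (not from this thread or repo): if you can reach the place where the TensorFlow session is created in the training container, the TF 1.x GPU options can stop the process from grabbing all GPU memory up front and can cap how much it may use. The exact fraction and where the session is built are assumptions for illustration only.

```python
# Sketch only: assumes a TF 1.x environment and access to the point where
# the training session is constructed. Values below are illustrative.
import tensorflow as tf

config = tf.ConfigProto()
# Allocate GPU memory on demand instead of reserving it all at startup.
config.gpu_options.allow_growth = True
# Optionally cap the share of GPU memory this process may claim (0.8 is a guess).
config.gpu_options.per_process_gpu_memory_fraction = 0.8

sess = tf.Session(config=config)
```

This only bounds what TensorFlow reserves; if the per-episode data itself outgrows the GPU, the underlying issue (longer episodes producing more training data) would still need to be addressed, e.g. by limiting batch or episode size.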