NVlabs / GA3C

Hybrid CPU/GPU implementation of the A3C algorithm for deep reinforcement learning.
BSD 3-Clause "New" or "Revised" License
652 stars, 195 forks

Training Slowdown #12

Closed by djl11 7 years ago

djl11 commented 7 years ago

The issue is documented here, but I was wondering whether you have ever had problems with messages like this appearing during training:

tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 1648707130 get requests, put_count=1648707127 evicted_count=2741000 eviction_rate=0.00166251 and unsatisfied allocation rate=0.00166258

I get this message quite often when running a fresh, unmodified clone of this repo on Pong. The issue seems worse when training on a custom pygame game I made, and on one occasion training ground to a halt completely, with no more output to results.txt and the console full of these messages.

If you have never had problems like this with your network, then I will close the issue. Otherwise, any advice would be greatly appreciated.

ppwwyyxx commented 7 years ago

It's not a warning (i.e. not a bad thing to see this message).
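To make that concrete, the line can be read as plain allocator statistics. The sketch below parses the message quoted above and recomputes the ratio. `parse_pool_allocator_line` is a hypothetical helper written for this thread, not part of TensorFlow or GA3C, and the regular expression assumes the exact field layout of the quoted line. For that line, the reported `eviction_rate` matches `evicted_count / put_count`, i.e. roughly 0.17% of buffers returned to the pool were evicted.

```python
import re

# Hypothetical helper, not part of TensorFlow or GA3C.
# The pattern assumes the field layout of the log line quoted in this issue.
_POOL_STATS = re.compile(
    r"After (?P<get_requests>\d+) get requests, "
    r"put_count=(?P<put_count>\d+) "
    r"evicted_count=(?P<evicted_count>\d+) "
    r"eviction_rate=(?P<eviction_rate>[\d.eE+-]+)"
)


def parse_pool_allocator_line(line):
    """Return the counters from a PoolAllocator log line, or None if it does not match."""
    match = _POOL_STATS.search(line)
    if match is None:
        return None
    stats = {
        name: int(match.group(name))
        for name in ("get_requests", "put_count", "evicted_count")
    }
    stats["eviction_rate"] = float(match.group("eviction_rate"))
    return stats


# The message quoted in this issue.
LOG_LINE = (
    "tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: "
    "After 1648707130 get requests, put_count=1648707127 evicted_count=2741000 "
    "eviction_rate=0.00166251 and unsatisfied allocation rate=0.00166258"
)

stats = parse_pool_allocator_line(LOG_LINE)
# Ratio recomputed from the raw counters, for comparison with the reported rate.
recomputed_rate = stats["evicted_count"] / stats["put_count"]
```

Comparing `recomputed_rate` with `stats["eviction_rate"]` over successive messages shows whether the rate is stable or climbing. A low, stable rate is consistent with the message being informational.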

djl11 commented 7 years ago

Thanks. I will need to investigate the training slowdown further, then. If nobody has experienced training slowing down (and stopping) with a plethora of these messages in the terminal, I will close the issue later.

mbz commented 7 years ago

This message may show up from time to time (not with high frequency, though), and it's normal, as @ppwwyyxx mentioned. I've never experienced such a slowdown with the default code. Which number are you looking at to check the speed?

djl11 commented 7 years ago

Sorry, I should really check things further before posting. I now know that I wasn't getting any output to the results.txt file because my learning started to diverge and the agent got stuck in infinite non-terminal loops of gameplay in my game. I falsely presumed the stall was related to this message, which filled my console. Will close now, and in future I'll investigate a bit more before posting! Thanks.
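For anyone who hits the same symptom with a custom game, one common guard against non-terminal loops is to cap the episode length. The following is a minimal sketch under stated assumptions: `StepLimit` and `NeverEndingEnv` are hypothetical names, and the `reset()` / `step(action)` interface returning `(observation, reward, done)` is assumed for illustration. It is not GA3C's actual environment API.

```python
class NeverEndingEnv:
    """Toy stand-in for a game that never reaches a terminal state (hypothetical)."""

    def reset(self):
        return 0  # dummy observation

    def step(self, action):
        # Always returns done=False, mimicking an agent stuck in a gameplay loop.
        return 0, 0.0, False


class StepLimit:
    """Wrap an environment and force the episode to end after max_steps steps.

    Assumes the wrapped env exposes reset() and step(action) -> (obs, reward, done).
    """

    def __init__(self, env, max_steps):
        self.env = env
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        self.steps = 0
        return self.env.reset()

    def step(self, action):
        observation, reward, done = self.env.step(action)
        self.steps += 1
        if self.steps >= self.max_steps:
            done = True  # truncate the episode so results keep being reported
        return observation, reward, done


def run_episode(env, action=0):
    """Play one episode with a fixed action and return the number of steps taken."""
    env.reset()
    steps = 0
    done = False
    while not done:
        _, _, done = env.step(action)
        steps += 1
    return steps


episode_length = run_episode(StepLimit(NeverEndingEnv(), max_steps=1000))
```

With the cap in place, a diverged policy still produces finished episodes, so a stall in the results file is easier to tell apart from a genuine slowdown.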