Baylus / 2048I

Teaching ML to play 2048 better
MIT License

Enable GPU utilization if needed #9

Closed Baylus closed 3 months ago

Baylus commented 4 months ago

From what I understand, there are some optimizations that can improve the time it takes to train the DQN by utilizing the GPU rather than the CPU. This optimization would primarily come from the tensorflow library, as none of the other libraries we use have GPU-enabled functionality.

I went through a bit of work to get tensorflow to see the device itself, so I am hoping that this will lead to some speedups. I had to get WSL all updated and configured and do some monkeying around with the code to get it to work with python 3.9 (I could probably upgrade the linux python too, but last time I tried that I bricked my whole python ecosystem and it took me about half a day to get it all back together on windows, so I will be delaying that). I am going to time both setups on how long it takes to train 50 episodes. Hopefully the GPU run will be faster.
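For reference, a minimal sketch of the visibility check, assuming TF 2.x inside WSL (the memory-growth part is optional and not necessarily what the project does):

```python
import tensorflow as tf

# List the GPUs TensorFlow can actually see; an empty list means
# training will silently fall back to the CPU.
gpus = tf.config.list_physical_devices('GPU')
print("Visible GPUs:", gpus)

# Optional: let TensorFlow grow GPU memory on demand instead of
# grabbing it all up front.
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
```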

Not sure if my memory is just bad, but I thought that the training steps used to take ~19 ms each; they are only taking 10-11 ms now, so that seems like a good sign.

Baylus commented 4 months ago

So, unfortunately I didn't bring over the verbosity silencing on the predict call for each move step, which is insanely important for execution speed, so my testing is not going to be super accurate. Also, I foolishly forgot to protect my duration printing from exceptions, so if I keyboard interrupt now, I won't get the program's runtime that way.
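A minimal sketch of both fixes, assuming a Keras DQN model; the function and argument names here are illustrative, not the actual code:

```python
import time

import numpy as np
import tensorflow as tf

def choose_action(model: tf.keras.Model, state: np.ndarray) -> int:
    # verbose=0 suppresses the progress bar Keras prints on every predict();
    # with one predict per move, that logging dominates the runtime.
    q_values = model.predict(state[np.newaxis, ...], verbose=0)
    return int(np.argmax(q_values[0]))

def timed_run(train_fn) -> None:
    # try/finally guarantees the duration still prints on a KeyboardInterrupt.
    start = time.time()
    try:
        train_fn()
    finally:
        print(f"Total runtime: {time.time() - start:.1f} s")
```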

For reference, it is 12:40 on 7/29, and I started this run at 16:33 on 7/27. We have gotten through 32 training games in about 44 hours, so roughly 0.73 games per hour.

Baylus commented 4 months ago

After finishing this run, we got the result below for the linux run, which MAY be utilizing the gpu. There's also a slight difference in the reported processing time, which is good to see; it may be accounting for the actual total processing time and not just how long the program has been running.

[image: output of the linux training run]

The linux "GPU" training took 1:17:08:54 (1 day, 17:08:54) to finish.

Baylus commented 4 months ago

So I had this stupid mouse bug again and had to stop the run. The run got to episode 43. The wildest thing: the standard time delta was 1:12:59:23 (1 day, 12:59:23), but the processed time was 1:14:57:44... meaning the process was supposedly running for longer than the actual difference in wall-clock time...

Anyway, being as generous as possible to windows and using the lower of the two numbers, the windows run processed about 1.13 episodes per hour, and we know the last 7 generations take longer than the earlier ones (the windows run stopped before reaching them). The linux version managed about 1.22 episodes per hour (50 episodes in roughly 41 hours). So it's a marginal difference, but still worth having, unless there is some other extenuating improvement that is only available on windows.

The most notable of these would be the parallel execution training. I can't think of a reason a platform difference would cause problems there, but I was also surprised that windows could not take advantage of the GPU, so the world is surprising, it seems.

Baylus commented 3 months ago

So, super unfortunately, I ran into my dreaded mouse bug and had to restart my computer to be able to select a different window. But we got through about 68 episodes. Unfortunately it's not saving the weights the way I would like in case of keyboard interrupts, so I will mark that down as a TODO.
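A rough sketch of what that TODO could look like, assuming a Keras model; run_episode() and the weights path are placeholders, not the project's actual code:

```python
import tensorflow as tf

def train_with_checkpoints(model: tf.keras.Model, episodes: int,
                           weights_path: str = "dqn_weights.h5") -> None:
    """Run training episodes, saving weights even if interrupted."""
    try:
        for episode in range(episodes):
            run_episode(model)  # hypothetical per-episode training step
            if episode % 5 == 0:
                # Periodic checkpoint so a crash loses at most a few episodes.
                model.save_weights(weights_path)
    except KeyboardInterrupt:
        print("Interrupted; saving current weights before exiting.")
    finally:
        # Always persist the latest weights, interrupt or not.
        model.save_weights(weights_path)
```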

Apparently our duration was 2 days, 17:52:13.710002.

Here's our graph of what our fitness is looking like: [image: fitness graph]

We are again seeing games time out more and more as time goes on. Hopefully the training improves that trend as the agent starts to learn more.

Baylus commented 3 months ago

I am going to squash and merge, though, and start a new branch for this testing.