yfeng997 / MadMario

Interactive tutorial to build a learning Mario, for first-time RL learners

GPU Speed Up Benchmark #1

Closed by yfeng997 4 years ago

yfeng997 commented 4 years ago

Here we compare training time on a MacBook Pro (CPU) against Google Colab (GPU). In the terminal output below, pay attention to Step Time: it is the average time per iteration, covering act(), step(), learn(), and remember().
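For readers who want to reproduce a comparable number, here is a minimal sketch of how an average step time of this kind could be measured. The repository's actual logging code is not shown in this post, so everything here is an assumption: `DummyEnv`, `DummyAgent`, and `mean_step_time` are hypothetical stand-ins, and only the four call names (act, step, learn, remember) come from the description above.

```python
import time


class DummyEnv:
    """Hypothetical stand-in environment: each episode ends after a fixed number of steps."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return 0  # initial state

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_length
        return self.t, 1.0, done  # next_state, reward, done


class DummyAgent:
    """Hypothetical stand-in agent exposing the three agent calls named in the post."""

    def act(self, state):
        return 0

    def remember(self, experience):
        pass

    def learn(self):
        pass


def mean_step_time(env, agent, episodes):
    """Run `episodes` episodes and return (average seconds per iteration, total steps).

    One iteration covers act(), step(), remember() and learn().
    """
    total_steps = 0
    start = time.perf_counter()
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            action = agent.act(state)
            next_state, reward, done = env.step(action)
            agent.remember((state, next_state, action, reward, done))
            agent.learn()
            state = next_state
            total_steps += 1
    elapsed = time.perf_counter() - start
    return elapsed / total_steps, total_steps


avg, steps = mean_step_time(DummyEnv(), DummyAgent(), episodes=3)
print(f"{steps} steps, {avg:.6f} s per step")
```

Swapping the dummy classes for a real environment and agent would give a wall-clock figure in the same spirit as the Step Time column, though the logged values may be computed differently (for example over a moving window).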

MacBook Pro CPU

Episode 20 - Step 3603 - Step Time 0.065 - Epsilon 0.999 - Mean Reward 578.905 - Mean Length 171.571 - Mean Loss 2.127 - Mean Q Value 4.123 - Time 2020-06-05T20:23:33
Episode 21 - Step 3643 - Step Time 0.066 - Epsilon 0.999 - Mean Reward 563.091 - Mean Length 165.591 - Mean Loss 2.056 - Mean Q Value 4.087 - Time 2020-06-05T20:23:36
Episode 22 - Step 4097 - Step Time 0.068 - Epsilon 0.999 - Mean Reward 581.696 - Mean Length 178.13 - Mean Loss 1.994 - Mean Q Value 4.063 - Time 2020-06-05T20:24:06
Episode 23 - Step 4195 - Step Time 0.07 - Epsilon 0.999 - Mean Reward 583.542 - Mean Length 174.792 - Mean Loss 1.934 - Mean Q Value 4.041 - Time 2020-06-05T20:24:13
Episode 24 - Step 4235 - Step Time 0.071 - Epsilon 0.999 - Mean Reward 569.44 - Mean Length 169.4 - Mean Loss 1.877 - Mean Q Value 4.019 - Time 2020-06-05T20:24:16
Episode 25 - Step 4493 - Step Time 0.068 - Epsilon 0.999 - Mean Reward 576.231 - Mean Length 172.808 - Mean Loss 1.824 - Mean Q Value 4.001 - Time 2020-06-05T20:24:34

Google Colab GPU

Episode 41 - Step 9149 - Step Time 0.018 - Epsilon 0.998 - Mean Reward 733.976 - Mean Length 217.833 - Mean Loss 0.859 - Mean Q Value 2.953 - Time 2020-06-06T03:22:58
Episode 42 - Step 9568 - Step Time 0.019 - Epsilon 0.998 - Mean Reward 742.163 - Mean Length 222.512 - Mean Loss 0.848 - Mean Q Value 2.963 - Time 2020-06-06T03:23:06
Episode 43 - Step 10622 - Step Time 0.019 - Epsilon 0.997 - Mean Reward 745.114 - Mean Length 241.409 - Mean Loss 0.842 - Mean Q Value 3.004 - Time 2020-06-06T03:23:26
Episode 44 - Step 10662 - Step Time 0.019 - Epsilon 0.997 - Mean Reward 733.689 - Mean Length 236.933 - Mean Loss 0.837 - Mean Q Value 3.054 - Time 2020-06-06T03:23:26
Episode 45 - Step 10702 - Step Time 0.018 - Epsilon 0.997 - Mean Reward 722.761 - Mean Length 232.652 - Mean Loss 0.832 - Mean Q Value 3.104 - Time 2020-06-06T03:23:27
Episode 46 - Step 10828 - Step Time 0.018 - Epsilon 0.997 - Mean Reward 719.851 - Mean Length 230.383 - Mean Loss 0.827 - Mean Q Value 3.16 - Time 2020-06-06T03:23:30

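Step Time drops from about 0.065 to 0.071 on the MacBook Pro CPU to about 0.018 to 0.019 on the Colab GPU, a speedup of roughly 3.7x (~4x) from training on the GPU.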
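The speedup figure can be checked directly from the Step Time column of the two logs. The sketch below is only an illustration: the `step_times` helper and its regular expression are not part of the repository, and the two lists are the Step Time values copied from the logs in this post.

```python
import re
from statistics import mean

# Matches the "Step Time <number>" field of a log line (but not "Step <number>").
STEP_TIME = re.compile(r"Step Time (\d+(?:\.\d+)?)")


def step_times(log_text):
    """Return every Step Time value found in a block of log text."""
    return [float(value) for value in STEP_TIME.findall(log_text)]


# Step Time values copied from the two logs above.
cpu = [0.065, 0.066, 0.068, 0.07, 0.071, 0.068]   # MacBook Pro CPU, episodes 20-25
gpu = [0.018, 0.019, 0.019, 0.019, 0.018, 0.018]  # Google Colab GPU, episodes 41-46

speedup = mean(cpu) / mean(gpu)
print(f"CPU mean {mean(cpu):.4f}, GPU mean {mean(gpu):.4f}, speedup {speedup:.2f}x")
```

Averaging the six logged values on each side gives a ratio a little under 3.7, consistent with the rounded ~4x figure. `step_times` can be pointed at a full log file to repeat the calculation over more episodes.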