findmyway opened this issue 3 years ago (status: Open)
Actually, Q3 answers Q1: we don't need to transfer data between the CPU and GPU in most cases.
Does this mean storing the BitArray{3} on the GPU, so that the state can be fed directly to the neural network whose weights are stored on the GPU?
Yes
Where does environment logic get executed - CPU or GPU?
GPU
I have thought more deeply about it.
From what I understand, if we are to use the GPU, then the env instance would sit on the GPU and all environment-related computations would happen there, so that the neural network can access the state relatively easily. I want to know whether it would be worth doing the env logic, like taking actions (which doesn't have much parallelism in it), on the GPU, versus doing everything on the CPU and moving data between the CPU and GPU at each step.
Let
n = total number of steps that need to be executed in the env in order to train a policy from scratch,
x = avg. cost of env logic per step on the GPU,
y = avg. cost of env logic per step on the CPU (potentially implemented in a multithreaded fashion, which would give even more performance than presently in benchmark.md) + avg. cost of moving the state from CPU to GPU + avg. cost of moving the computed action from GPU to CPU to be executed in the env,
z = avg. total cost of fully training a policy in the env on the GPU from scratch, excluding env logic.
Ideally, we would want to use the GPU if:
(n*y)/z > (n*x)/z
If LHS is significantly greater than RHS, then we can justify this feature. Correct me if I am wrong, but it is not obvious to me that this equation holds.
Even more importantly, if (n*y)/z << 1, that is, if the total cost of env logic on the CPU is much less than the total cost of training on the GPU excluding env logic, then we won't gain much by incorporating GPU support. My initial hunch is that this holds, because the env logic is quite simple and should cost far less than training a neural network. I have left out reset!, assuming that it gets amortized over the total number of steps and isn't significantly costly overall.
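To make the argument concrete, the inequality can be evaluated with illustrative numbers. All figures below are hypothetical placeholders chosen only to show the shape of the comparison, not measurements from any benchmark:

```python
# Hypothetical per-step costs in seconds; real values would have to be benchmarked.
n = 10_000_000      # total env steps to train a policy from scratch (assumed)
x = 50e-6           # avg. cost of env logic per step on the GPU (assumed)
y_logic = 5e-6      # avg. cost of env logic per step on the CPU (assumed)
y_transfer = 20e-6  # avg. CPU<->GPU state/action transfer cost per step (assumed)
y = y_logic + y_transfer
z = 3_600.0         # total GPU training cost excluding env logic (assumed)

lhs = (n * y) / z   # relative cost of the CPU route
rhs = (n * x) / z   # relative cost of the GPU route

# GPU-side env logic is justified only if the CPU route is significantly costlier.
use_gpu = lhs > rhs
# Even then, if (n*y)/z << 1, env logic is a negligible fraction of training
# time, so the overall speedup from GPUizing it is bounded by that fraction.
negligible = lhs < 0.1
print(use_gpu, negligible, round(lhs, 3), round(rhs, 3))
```

With these placeholder numbers the CPU route costs about 7% of training time and the GPU route about 14%, so the inequality fails and the env-logic share is small either way, which matches the hunch above; the conclusion obviously flips if real measurements differ.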
What do you think?
It seems someone has already done something related:
https://discourse.julialang.org/t/alphagpu-an-alphazero-implementation-wholly-on-gpu/60030
Thanks for pointing it out! By the way, I won't be able to work on GPUization anytime soon.
@findmyway This would be my first time working with GPU programming (or any form of concurrent programming, for that matter), so I have a few questions:
Does this mean storing the BitArray{3} on the GPU, so that the state can be fed directly to the neural network whose weights are stored on the GPU? Where does environment logic get executed - CPU or GPU?