arthurxlw / cytonRL

reinforcement learning, deep Q-network, double DQN, dueling DQN, prioritized experience replay
Apache License 2.0

Reduce last warp: the shared memory sdata should be declared volatile #3

Open hitblackjack opened 4 years ago

hitblackjack commented 4 years ago

DeviceMatrix.cu, line 381: `if (tid < warpSize) { for (size_t shift = warpSize; shift > 0; shift >>= 1) sdata[tid] += sdata[tid + shift]; }`

Since there is no `__syncthreads()` inside this last-warp loop, the shared memory array `sdata` should be declared volatile; otherwise the compiler is free to cache `sdata[tid]` in a register between iterations, so threads in the warp may read stale partial sums.

reference: https://stackoverflow.com/questions/21205471/cuda-in-warp-reduction-and-volatile-keyword
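For illustration, here is a minimal sketch of the pattern being described (a hypothetical kernel, not cytonRL's actual DeviceMatrix.cu code; the kernel name, launch configuration, and float accumulator are assumptions). The last-warp phase reads `sdata` through a volatile pointer so every access is re-issued to shared memory instead of being kept in a register:

```cuda
// Sketch: block-wide sum reduction whose last-warp phase uses a volatile
// pointer instead of __syncthreads(). Assumes blockDim.x is a power of two,
// blockDim.x >= 2 * warpSize, and the kernel is launched with
// blockDim.x * sizeof(float) bytes of dynamic shared memory.
__global__ void reduceSum(const float* in, float* out, size_t n)
{
    extern __shared__ float sdata[];
    size_t tid = threadIdx.x;
    size_t i = (size_t)blockIdx.x * blockDim.x + tid;
    sdata[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();

    // Tree reduction over the whole block; __syncthreads() makes each
    // step's writes visible before the next step reads them.
    for (size_t s = blockDim.x / 2; s > (size_t)warpSize; s >>= 1)
    {
        if (tid < s)
            sdata[tid] += sdata[tid + s];
        __syncthreads();
    }

    // Last warp: no __syncthreads() here, so the volatile qualifier is what
    // forces every load/store of sdata to actually go through shared memory.
    if (tid < (size_t)warpSize)
    {
        volatile float* vdata = sdata;
        for (size_t shift = warpSize; shift > 0; shift >>= 1)
            vdata[tid] += vdata[tid + shift];
    }

    if (tid == 0)
        out[blockIdx.x] = sdata[0];
}
```

Note that on Volta and later GPUs with independent thread scheduling, the volatile qualifier alone is no longer considered sufficient for warp-synchronous code; inserting `__syncwarp()` between the steps, or rewriting the last warp with `__shfl_down_sync`, is the safer fix.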