Reduce last warp: the shared memory sdata should be declared volatile

arthurxlw / cytonRL

reinforcement learning, deep Q-network, double DQN, dueling DQN, prioritized experience replay

Apache License 2.0

Open hitblackjack opened 4 years ago

hitblackjack commented 4 years ago

DeviceMatrix.cu, line 381:

    if (tid < warpSize) {
        for (size_t shift = warpSize; shift > 0; shift >>= 1)
            sdata[tid] += sdata[tid + shift];
    }

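Without a "__syncthreads()" inside this loop, the shared memory array sdata should be declared volatile. Otherwise the compiler is free to keep sdata[tid] in a register across iterations, so a thread may never see the values written by the other threads of the warp.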
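A minimal sketch of what the fix could look like, not the actual cytonRL code. The names warpReduce, blockSum, sumOnDevice and kBlock are hypothetical, and the block size of 256 is an assumption. The last-warp step takes sdata through a volatile pointer, as suggested above. It also calls __syncwarp() (CUDA 9 or later), because on Volta and newer GPUs the threads of a warp are no longer guaranteed to run in lockstep, so volatile alone may not be enough there.

```cuda
#include <vector>
#include <cuda_runtime.h>

constexpr int kBlock = 256;  // hypothetical block size: power of two, >= 64

// Last-warp step. The volatile qualifier forces every access to sdata to go
// to shared memory instead of a register copy. Each iteration reads first,
// synchronizes the warp, then writes, so no thread overwrites a value that
// another thread in the warp still has to read.
__device__ void warpReduce(volatile float* sdata, unsigned tid) {
    for (unsigned shift = warpSize; shift > 0; shift >>= 1) {
        float v = sdata[tid] + sdata[tid + shift];
        __syncwarp();
        sdata[tid] = v;
        __syncwarp();
    }
}

// Sums kBlock consecutive inputs per block and writes one partial sum per block.
__global__ void blockSum(const float* in, float* out, int n) {
    __shared__ float sdata[kBlock];
    unsigned tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    sdata[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();

    // Tree reduction across warps: a block-wide barrier is needed here.
    for (unsigned s = blockDim.x / 2; s > warpSize; s >>= 1) {
        if (tid < s) sdata[tid] += sdata[tid + s];
        __syncthreads();
    }

    // Only the first warp continues, without __syncthreads().
    if (tid < warpSize) warpReduce(sdata, tid);
    if (tid == 0) out[blockIdx.x] = sdata[0];
}

// Host helper (assumes a non-empty input; error checking omitted for brevity).
float sumOnDevice(const std::vector<float>& h) {
    int n = static_cast<int>(h.size());
    int blocks = (n + kBlock - 1) / kBlock;
    float* dIn = nullptr;
    float* dOut = nullptr;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dOut, blocks * sizeof(float));
    cudaMemcpy(dIn, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    blockSum<<<blocks, kBlock>>>(dIn, dOut, n);
    std::vector<float> partial(blocks);
    cudaMemcpy(partial.data(), dOut, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    float total = 0.0f;
    for (float p : partial) total += p;  // finish the sum on the host
    return total;
}
```

For example, sumOnDevice(std::vector<float>(1000, 1.0f)) should return 1000 on a working device. In the final iterations, threads with tid >= shift compute values that are never used. This mirrors the loop quoted above and does not affect sdata[0].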