leVirve / All-Pair-Shortest-Path-CUDA

Accelerate algorithm block all-pair shortest path through CUDA
MIT License
6 stars 4 forks source link

__syncthreads needed in asps_cuda.cu pass 2? #2

Open aleaverfay opened 2 years ago

aleaverfay commented 2 years ago

I have been reading the Katz and Kider 2008 paper and believe you are implementing the algorithm that they describe.

They only talk about synchronization for the first pass of their algorithm, but I believe there also needs to be synchronization between threads in the second pass, too. (If you don't think so, could you explain to me why you wouldn't?)

This would mean you need some __syncthread calls beneath lines 60, 66 in apsp_cuda.cu (and in similar positions in the other two implementations). I believe this may explain the inaccurate results you were getting in your report.pdf.

leVirve commented 2 years ago

Wow, thanks for your reminder! I don't know the paper All-Pairs-Shortest-Paths for Large Graphs on the GPU when doing this assignment. 😊

I agree with you that it might need some synchronization barriers after modifying the shared memory like the first pass. However, I'm not working on CUDA programming for a long time. If I have time, I'd love to reproduce the issue, and thank you for the insight.