supranational / sppark

Zero-knowledge template library
Apache License 2.0

Race condition in msm/sort.cuh #24

Closed web3hulei closed 11 months ago

web3hulei commented 11 months ago

There is a race condition between lines 267 and 278 of msm/sort.cuh. If a warp with a larger warp id executes line 278 before the warp with warpid=0 executes line 267, the calculation result will be wrong. In practice, on the Ampere architecture, the scheduling strategy of the warp scheduler makes it impossible for a warp (id>0) to execute line 278 before warp (id=0) executes line 267, so the test always passes. Nevertheless, this is a logical error.

dot-asm commented 11 months ago

Thanks! I have to ask, did you spot it by eyeballing, or is there some tool that spots potential race conditions? If the former, I have to praise the keenness, and if the latter, I'd like to know which one it is :-) Thanks again!

web3hulei commented 11 months ago

I spotted it by eyeballing.

web3hulei commented 11 months ago

In addition, to correct my previous statement: on the Ampere architecture, warps in the same block (1024 threads) are not necessarily scheduled by the same warp scheduler, so the scheduling strategy of the warp scheduler does not guarantee that warps (id>0) update the counters only after warp (id=0) has read them.
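
The thread does not quote the code at lines 267 and 278, so the following is only a minimal, self-contained sketch of the general hazard described above, not sppark's code or its actual fix. It assumes a pattern in which warp 0 reads per-block shared counters and the other warps then update them. The names `snapshot_then_update`, `snapshot_is_consistent`, `counters` and `NUM_COUNTERS`, and the initial values, are hypothetical. The sketch relies on `__syncthreads()` between the read and the update, so correctness does not depend on how warps happen to be scheduled.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int NUM_COUNTERS = 32;   // hypothetical: one counter per lane of a warp
constexpr int BLOCK_SIZE   = 1024; // 32 warps per block, as in the discussion

__global__ void snapshot_then_update(unsigned int* snapshot_out,
                                     unsigned int* final_out)
{
    __shared__ unsigned int counters[NUM_COUNTERS];

    const unsigned int warp_id = threadIdx.x / warpSize;
    const unsigned int lane_id = threadIdx.x % warpSize;

    // Initialise the shared counters and make them visible to the whole block.
    if (threadIdx.x < NUM_COUNTERS)
        counters[threadIdx.x] = 100u + threadIdx.x;
    __syncthreads();

    // Step A (stands in for the read by warp 0): snapshot the counters.
    unsigned int snapshot = 0;
    if (warp_id == 0)
        snapshot = counters[lane_id];

    // Without this barrier, a warp with id > 0 could reach step B before
    // warp 0 has completed step A, and the snapshot would see updated values.
    __syncthreads();

    // Step B (stands in for the update by the other warps).
    if (warp_id > 0)
        atomicAdd(&counters[lane_id], 1u);

    // Wait for all updates before reading the final counter values.
    __syncthreads();

    if (warp_id == 0) {
        snapshot_out[lane_id] = snapshot;
        final_out[lane_id]    = counters[lane_id];
    }
}

// Host helper: launches one block and checks that the snapshot holds the
// pre-update values and that every counter received one increment per
// updating warp. Returns false on any CUDA error or mismatch.
bool snapshot_is_consistent()
{
    const size_t bytes = NUM_COUNTERS * sizeof(unsigned int);
    unsigned int *d_snapshot = nullptr, *d_final = nullptr;
    unsigned int h_snapshot[NUM_COUNTERS], h_final[NUM_COUNTERS];

    if (cudaMalloc(&d_snapshot, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&d_final, bytes) != cudaSuccess) {
        cudaFree(d_snapshot);
        return false;
    }

    snapshot_then_update<<<1, BLOCK_SIZE>>>(d_snapshot, d_final);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    bool ok = cudaGetLastError() == cudaSuccess
           && cudaMemcpy(h_snapshot, d_snapshot, bytes,
                         cudaMemcpyDeviceToHost) == cudaSuccess
           && cudaMemcpy(h_final, d_final, bytes,
                         cudaMemcpyDeviceToHost) == cudaSuccess;

    const unsigned int updating_warps = BLOCK_SIZE / 32 - 1; // all warps except warp 0
    for (int i = 0; ok && i < NUM_COUNTERS; ++i) {
        ok = h_snapshot[i] == 100u + i
          && h_final[i]    == 100u + i + updating_warps;
    }

    cudaFree(d_snapshot);
    cudaFree(d_final);
    return ok;
}
```

Calling `snapshot_is_consistent()` from a host `main` and compiling with `nvcc` should be enough to try it. Removing the middle `__syncthreads()` reproduces the kind of bug reported here: the result may still look correct on a given GPU, but only by accident of scheduling.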