lattice / quda

QUDA is a library for performing calculations in lattice QCD on GPUs.
https://lattice.github.io/quda

STRICT build of dslash_mdw_fused_* fails with sm_86 #1403

Closed jcosborn closed 11 months ago

jcosborn commented 11 months ago

A STRICT build using sm_86 with MULTIGRID enabled fails with:

Building CUDA object lib/CMakeFiles/quda.dir/dslash_mdw_fused_ls20.cu.o
ptxas error : Value of threads per SM for entry _ZN4quda10raw_kernelINS_18mobius_tensor_core17FusedMobiusDslashENS1_14FusedDslashArgIsLi3EL21QudaReconstructType_s8ELi20ELNS19MdwfFusedDslashTypeE4ELi32ELi3ELb0EEELb0EEEvT0 is out of range. .minnctapersm will be ignored

maddyscientist commented 11 months ago

@hummingtree can you take a look?

hummingtree commented 11 months ago

@jcosborn This happens because SM 86, 87 and 89 only allow a maximum of 1536 threads per SM (as opposed to 2048). I will open a PR to fix this. In the meantime, you can disable this part of the code by passing -D QUDA_MDW_FUSED_LS_LIST="" as part of the cmake parameters, which I expect will also decrease your compile time by quite a bit.
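
For readers who want the arithmetic behind the ptxas message, here is a minimal host-side C++ sketch. It assumes the check amounts to "threads per block times the requested minimum blocks per SM must not exceed the architecture's resident-thread limit". The helper names (`max_threads_per_sm`, `launch_bounds_fit`) and the 512 x 4 launch configuration are hypothetical illustrations, not values taken from QUDA's fused kernels.

```cpp
#include <cassert>

// Maximum resident threads per SM for a handful of compute capabilities.
// Returns -1 for architectures this sketch does not cover.
constexpr int max_threads_per_sm(int sm) {
  switch (sm) {
    case 80: case 90: return 2048;
    case 86: case 87: case 89: return 1536;
    default: return -1;
  }
}

// Hypothetical helper: true if a kernel compiled with
// __launch_bounds__(threads_per_block, min_blocks_per_sm) requests no more
// resident threads than the SM can hold. When this is false, ptxas reports
// that .minnctapersm will be ignored, which a STRICT build treats as fatal.
constexpr bool launch_bounds_fit(int sm, int threads_per_block, int min_blocks_per_sm) {
  return max_threads_per_sm(sm) > 0 &&
         threads_per_block * min_blocks_per_sm <= max_threads_per_sm(sm);
}

// Illustrative configuration (assumed, not QUDA's): 512 threads x 4 blocks = 2048.
static_assert(launch_bounds_fit(80, 512, 4), "2048 threads fit on sm_80");
static_assert(!launch_bounds_fit(86, 512, 4), "2048 threads exceed the 1536 limit on sm_86");
static_assert(launch_bounds_fit(86, 512, 3), "1536 threads fit on sm_86");
```

Under these assumptions, a launch bound tuned for a 2048-thread SM has to request fewer blocks per SM (or smaller blocks) on SM 86, 87 and 89.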