Open weinbe2 opened 5 months ago
Description in title, example for Nc 64 -> 96:
eweinberg$ cuobjdump --dump-resource-usage restrictor_64_96.cu.o

Fatbin elf code:
================
arch = sm_80
code version = [1,7]
host = linux
compile_size = 64bit
compressed

Resource usage:
 Common:
  GLOBAL:19
 Function _ZN4quda13BlockKernel2DINS_10RestrictorENS_14BlockKernelArgILj1ENS_11RestrictArgIffLi2ELi64ELi2ELi96ELb0EEEEELb0EEENSt9enable_ifIXclsr6deviceE14use_kernel_argIT0_EEEvE4typeES7_:
  REG:255 STACK:1280 SHARED:1024 LOCAL:0 CONSTANT[2]:8 CONSTANT[0]:3712 TEXTURE:0 SURFACE:0 SAMPLER:0
 Function _ZN4quda13BlockKernel2DINS_10RestrictorENS_14BlockKernelArgILj1ENS_11RestrictArgIfsLi2ELi64ELi2ELi96ELb0EEEEELb0EEENSt9enable_ifIXclsr6deviceE14use_kernel_argIT0_EEEvE4typeES7_:
  REG:255 STACK:1280 SHARED:1024 LOCAL:0 CONSTANT[2]:8 CONSTANT[0]:3728 TEXTURE:0 SURFACE:0 SAMPLER:0
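To check other object files the same way, a small script can scan the `cuobjdump --dump-resource-usage` text and list kernels that report a non-zero `STACK`. This is a minimal sketch, not part of QUDA: it assumes the output has the shape shown above, and it assumes that a non-zero `STACK` alongside `REG:255` is the symptom of interest (it usually points to register spilling). The function names and the sample kernel names `kernel_a` and `kernel_b` are made up for illustration.

```python
import re

# "Function <mangled name>:" followed by its resource fields, up to the
# next "Function" entry or the end of the text. Works whether the dump is
# on separate lines or flattened into one line.
FUNC_RE = re.compile(r"Function\s+(\S+):\s+(.*?)(?=\s+Function\s|\Z)", re.S)
# Fields look like REG:255 or CONSTANT[0]:3712.
FIELD_RE = re.compile(r"([A-Z]+(?:\[\d+\])?):(\d+)")


def parse_resource_usage(text):
    """Return {kernel name: {field: value}} from cuobjdump resource-usage text."""
    usage = {}
    for name, body in FUNC_RE.findall(text):
        usage[name] = {key: int(val) for key, val in FIELD_RE.findall(body)}
    return usage


def kernels_with_stack(usage):
    """Keep only kernels that report a non-zero per-thread stack frame."""
    return {name: res for name, res in usage.items() if res.get("STACK", 0) > 0}


# Hypothetical sample input; real input would be the captured cuobjdump output.
sample = (
    "Resource usage: Common: GLOBAL:19 "
    "Function kernel_a: REG:255 STACK:1280 SHARED:1024 LOCAL:0 "
    "CONSTANT[2]:8 CONSTANT[0]:3712 TEXTURE:0 SURFACE:0 SAMPLER:0 "
    "Function kernel_b: REG:64 STACK:0 SHARED:0 LOCAL:0 "
    "CONSTANT[0]:400 TEXTURE:0 SURFACE:0 SAMPLER:0"
)

usage = parse_resource_usage(sample)
flagged = kernels_with_stack(usage)
for name, res in flagged.items():
    print(f"{name}: REG={res['REG']} STACK={res['STACK']}")
```

Piping the real tool output into such a script (for example by reading `sys.stdin`) would make it easy to compare different `QUDA_MULTIGRID_NVEC_LIST` choices.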
Reference command to compile:
cmake -DCMAKE_BUILD_TYPE=RELEASE -DQUDA_DIRAC_DEFAULT_OFF=ON -DQUDA_DIRAC_STAGGERED=ON -DQUDA_GPU_ARCH=sm_80 -DQUDA_DOWNLOAD_USQCD=ON -DQUDA_QIO=ON -DQUDA_QMP=ON -DQUDA_MULTIGRID=ON -DQUDA_MULTIGRID_NVEC_LIST="24,64,96" ../quda
A quick copy+paste command that generates a well-behaved configuration and then does an MG solve with 3 <-> 64 <-> 96 can be found here: https://github.com/lattice/quda/wiki/Staggered-Multigrid-Solver#quick-context-free-example-solve-command