Newer versions of nvcc have a -arch=all-major option that compiles code for all sm_?0 architectures. Since we still need to support earlier nvcc versions, this PR achieves the same effect in a roundabout way for the fp16 code, which should considerably speed up compilation.
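As a rough illustration of what emulating -arch=all-major on an older nvcc can look like, here is a minimal sketch that builds one -gencode flag per major architecture. It is not the code from this PR: the helper name `gencode_flags`, the architecture list (60, 70, 80), and the choice to also embed PTX for the newest architecture are all assumptions made for the example.

```python
# Hypothetical helper: emulate `-arch=all-major` on nvcc versions that lack it
# by emitting one -gencode flag per major (sm_?0) architecture.

def gencode_flags(major_archs, ptx_for_newest=True):
    """Return nvcc -gencode flags for the given compute capabilities.

    major_archs: iterable of ints such as 60, 70, 80 (assumed list, adjust
    to whatever the installed nvcc actually supports).
    ptx_for_newest: also embed PTX for the newest architecture so that
    later GPUs can JIT-compile the kernels (an assumption of this sketch).
    """
    archs = sorted(set(major_archs))
    # One SASS target per major architecture.
    flags = [f"-gencode=arch=compute_{a},code=sm_{a}" for a in archs]
    if ptx_for_newest and archs:
        newest = archs[-1]
        # PTX fallback for GPUs newer than the newest listed architecture.
        flags.append(f"-gencode=arch=compute_{newest},code=compute_{newest}")
    return flags


if __name__ == "__main__":
    # Example architecture list; chosen for illustration only.
    print(" ".join(gencode_flags([60, 70, 80])))
```

The resulting string can be passed to nvcc in place of -arch=all-major; a real build would derive the architecture list from the nvcc version in use rather than hard-coding it.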