simplify cuda architectures used

LeelaChessZero / lc0

The rewritten engine, originally for tensorflow. Now all other backends have been ported here.

GNU General Public License v3.0


simplify cuda architectures used #1885

Closed borg323 closed 1 year ago

borg323 commented 1 year ago

Newer versions of nvcc have a -arch=all-major option that compiles code for all sm_?0 architectures, i.e. one target per major compute capability. Since we still need to support earlier nvcc versions, this PR achieves the same result in a roundabout way for the fp16 code, which should considerably speed up compilation.
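
For illustration only, here is a rough sketch of what "one target per sm_?0 architecture" can look like when spelled out as explicit -gencode flags for an nvcc that lacks -arch=all-major. This is not the change made in the PR: the helper name gencode_flags and the default list of major versions (6 through 9) are assumptions chosen for the example, and the trailing PTX entry for the newest architecture is likewise an assumption.

```python
def gencode_flags(majors=(6, 7, 8, 9)):
    """Build nvcc -gencode flags, one per sm_X0 architecture.

    `majors` is a hypothetical default; adjust it to the architectures
    the installed nvcc actually supports.
    """
    archs = [f"{m}0" for m in sorted(majors)]
    # One native (SASS) target per major architecture.
    flags = [f"-gencode=arch=compute_{a},code=sm_{a}" for a in archs]
    # Assumption: also embed PTX for the newest architecture so that
    # later GPUs can JIT-compile it.
    newest = archs[-1]
    flags.append(f"-gencode=arch=compute_{newest},code=compute_{newest}")
    return flags


if __name__ == "__main__":
    print(" ".join(gencode_flags()))
```

The resulting list could be appended to the nvcc command line by the build script. Compiling only the sm_?0 targets rather than every minor revision is what would reduce the number of compilation passes.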