Closed brigb123 closed 9 years ago
Up to 63 registers are used per thread at a time. A common maximum is 32768 registers per block. 32768 / 63 gives a maximum of 520 threads per block (rounded down), a limit that is often exceeded in multiple runs. Threads per block will need to be clamped to less than (max registers per block) / 63.
Alternatively, the files that use too many registers may need to be modified to use fewer.
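The clamp described above could look roughly like the sketch below. `clampThreadsPerBlock` is a hypothetical helper, not code from this repository, and it assumes the caller already knows both register counts. In a CUDA build, those could plausibly be read from `cudaDeviceProp::regsPerBlock` and `cudaFuncAttributes::numRegs` rather than hard-coded.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not from the repository): returns the largest
// threads-per-block count, no greater than `requested`, whose total
// register use fits in the per-block register budget.
int clampThreadsPerBlock(int requested, int regsPerBlock, int regsPerThread) {
    assert(regsPerBlock > 0 && regsPerThread > 0);
    // Integer division rounds down, e.g. 32768 / 63 == 520.
    int maxThreads = regsPerBlock / regsPerThread;
    return std::min(requested, maxThreads);
}
```

With the numbers from this issue, `clampThreadsPerBlock(1024, 32768, 63)` returns 520, while a request of 256 is left unchanged. The hardware may allocate registers in coarser units than this simple division assumes, so rounding the result down to a multiple of the warp size (for example 512) is probably the safer choice. For the second option, the usual ways to cap a kernel's register use are nvcc's `--maxrregcount` flag and the `__launch_bounds__` qualifier.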
Completed with f1d334d5b7e904b29579073c3d7236ed884f77f4
The master branch still has runtime CUDA errors. The develop branch might solve this.