data61 / cuda-fixnum

Extended-precision modular arithmetic library that targets CUDA.

Ensure arguments from user functions are always read into registers #35

Open unzvfu opened 6 years ago

unzvfu commented 6 years ago

The compiler apparently doesn't do this on its own, which causes a 3x slowdown in the mul_lo code. I'm not sure how to do it automatically with a variadic parameter pack, though...
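One possible way to handle the parameter pack, sketched under assumptions rather than taken from the library: expand the pack into the by-value parameters of a lambda, so every argument becomes a thread-local copy before the user function sees it, then store the copies back afterwards. The names `call_on_local_copies`, `mul_lo_example` and `HOST_DEVICE` are hypothetical and not part of cuda-fixnum, and `unsigned` stands in for the real fixnum type. Whether the copies actually end up in registers is still the compiler's decision; the copies only remove the aliasing with global memory that could prevent it.

```cpp
#include <cassert>
#include <cstddef>

// Expands to CUDA qualifiers under nvcc and to nothing on a host compiler.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical dispatcher (not the cuda-fixnum API): copies element `idx` of
// every array into a local, calls `fn` on the locals, then stores them back.
template <typename Func, typename... Args>
HOST_DEVICE void call_on_local_copies(Func fn, std::size_t idx, Args*... arrays) {
    // The lambda takes its parameters by value, so each one is a local copy.
    auto body = [&](Args... locals) {
        fn(locals...);  // the user function only ever touches the locals
        // C++11 pack-expansion idiom: write every local back to its array.
        int expand[] = {0, ((void)(arrays[idx] = locals), 0)...};
        (void)expand;
    };
    body(arrays[idx]...);  // one load per argument happens here
}

// Toy stand-in for a user function: low word of the product.
struct mul_lo_example {
    HOST_DEVICE void operator()(unsigned &r, unsigned a, unsigned b) const {
        r = a * b;
    }
};

// Host-side usage example.
inline void check_example() {
    unsigned r[2] = {0, 0}, a[2] = {3, 5}, b[2] = {7, 11};
    for (std::size_t i = 0; i < 2; ++i)
        call_on_local_copies(mul_lo_example(), i, r, a, b);
    assert(r[0] == 21u && r[1] == 55u);
}
```

This sketch stores every argument back, including pure inputs, and so requires all arrays to be writable; a real version would presumably write back only the outputs.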

unzvfu commented 4 years ago

Follow up at https://github.com/unzvfu/cuda-fixnum/issues/18.