unzvfu / cuda-fixnum

Extended-precision modular arithmetic library that targets CUDA.
MIT License

Enforce exponent sharing in sliding-window modexp function #23

Open unzvfu opened 4 years ago

unzvfu commented 4 years ago

From https://github.com/data61/cuda-fixnum/issues/41:

At the moment the exponent window array is allocated with malloc once per slot (see modexp<...>::modexp(...)), even though it only makes sense to use the function when all the exponents in the warp (or even the thread block) are the same.

Calling malloc for all that data is also computationally expensive.

It might also exceed the default 8 MB heap size, in which case the heap size would have to be managed manually from outside the modexp function call, which would be a pain in the neck.
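
To make the sharing idea concrete, here is a minimal host-side sketch in plain C++. It is not the cuda-fixnum API: `WindowStep`, `make_schedule`, `modexp_with_schedule` and `modexp_shared` are hypothetical names, and 32-bit moduli stand in for fixnums so that products fit in 64 bits. The point it illustrates is that the sliding-window decomposition depends only on the exponent, so it can be computed once and reused for every base, rather than allocated once per slot.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One step of the schedule: square `squarings` times, then multiply by
// base^odd_value (odd_value == 0 means "no multiplication").
struct WindowStep {
    unsigned squarings;
    std::uint32_t odd_value;
};

// Decompose the exponent into sliding windows of width at most w (w >= 1).
// This depends only on e, so a single copy can serve every base.
std::vector<WindowStep> make_schedule(std::uint64_t e, int w) {
    std::vector<WindowStep> steps;
    int i = 63;
    while (i >= 0 && ((e >> i) & 1) == 0) --i;  // skip leading zero bits
    while (i >= 0) {
        if (((e >> i) & 1) == 0) {
            steps.push_back({1, 0});            // zero bit: square only
            --i;
        } else {
            int s = (i >= w - 1) ? i - w + 1 : 0;
            while (((e >> s) & 1) == 0) ++s;    // window must end on a set bit
            const int len = i - s + 1;
            const std::uint64_t mask = (std::uint64_t(1) << len) - 1;
            steps.push_back({static_cast<unsigned>(len),
                             static_cast<std::uint32_t>((e >> s) & mask)});
            i = s - 1;
        }
    }
    return steps;
}

// Raise one base to the shared exponent, modulo m (assumes m != 0).
// Only the table of odd powers is per-base; the schedule is read-only.
std::uint32_t modexp_with_schedule(std::uint32_t base, std::uint32_t m,
                                   const std::vector<WindowStep>& steps,
                                   int w) {
    std::vector<std::uint64_t> odd_pow(std::size_t(1) << (w - 1));
    odd_pow[0] = base % m;                      // base^1
    const std::uint64_t b2 = odd_pow[0] * odd_pow[0] % m;
    for (std::size_t k = 1; k < odd_pow.size(); ++k)
        odd_pow[k] = odd_pow[k - 1] * b2 % m;   // base^(2k+1)
    std::uint64_t r = 1 % m;
    for (const WindowStep& s : steps) {
        for (unsigned j = 0; j < s.squarings; ++j) r = r * r % m;
        if (s.odd_value != 0) r = r * odd_pow[s.odd_value >> 1] % m;
    }
    return static_cast<std::uint32_t>(r);
}

// Build the schedule once, then apply it to every base ("slot").
std::vector<std::uint32_t> modexp_shared(const std::vector<std::uint32_t>& bases,
                                         std::uint64_t e, std::uint32_t m,
                                         int w) {
    const std::vector<WindowStep> steps = make_schedule(e, w);
    std::vector<std::uint32_t> out;
    for (std::uint32_t b : bases)
        out.push_back(modexp_with_schedule(b, m, steps, w));
    return out;
}
```

For example, `modexp_shared({2, 3}, 13, 1000, 2)` builds one schedule for the exponent 13 and uses it for both bases. In a kernel, the analogous schedule could presumably be built once per warp or thread block and kept in shared memory, which would avoid the per-slot malloc and the pressure on the device heap; that placement is an assumption and has not been tried against the library.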