vmx closed this 2 years ago
For easier review: here is the code that shows that in bellperson we were using 64-bit limbs for OpenCL prior to the refactoring: https://github.com/filecoin-project/bellperson/blob/f9a562f8f4d4f78f9ca1bc65ffcc34034aa14c4a/src/gpu/program.rs#L81
The refactoring that moved code from bellperson into this library led to a regression: OpenCL accidentally ended up using 32-bit limbs, which should result in lower performance than using 64-bit limbs.
The CUDA code path was always using the expected 32-bit limbs. Even if it doesn't look like it from this change, that was the case: the CUDA kernels are compiled at build time, and the limb size is correctly specified in build.rs.
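To make the intended behavior concrete, here is a minimal Rust sketch of per-framework limb-width selection. The `Framework` enum and `limb_bits` function are hypothetical names for illustration, not the actual API of this library; the point is only that CUDA kernels use 32-bit limbs while OpenCL performs better with 64-bit limbs.

```rust
// Hypothetical sketch, not the real ec-gpu-gen/bellperson API.
// It only illustrates the limb-width convention described above:
// CUDA -> 32-bit limbs, OpenCL -> 64-bit limbs.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Framework {
    Cuda,
    OpenCl,
}

/// Limb width in bits for a given GPU framework. CUDA kernels are
/// compiled at build time with 32-bit limbs; OpenCL kernels should
/// use 64-bit limbs for better performance.
fn limb_bits(framework: Framework) -> u32 {
    match framework {
        Framework::Cuda => 32,
        Framework::OpenCl => 64,
    }
}

fn main() {
    assert_eq!(limb_bits(Framework::Cuda), 32);
    assert_eq!(limb_bits(Framework::OpenCl), 64);
    println!("limb widths: CUDA=32, OpenCL=64");
}
```

With a helper like this, the limb width would be chosen in one place instead of being hard-coded separately in each code path, which is the kind of divergence that caused the regression.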