Benchmark against nVidia's XMP library

unzvfu / cuda-fixnum

Extended-precision modular arithmetic library that targets CUDA.

MIT License


Benchmark against nVidia's XMP library #46

Open unzvfu opened 4 years ago

unzvfu commented 4 years ago

See https://nvlabs.github.io/xmp/

dave-andersen commented 3 years ago

XMP 2.0 is called CGBN: https://github.com/NVlabs/CGBN

A quick benchmark suggests CGBN is about 2.5 times faster on a Titan V for a 1024 / 1024 modular exponentiation. Note that I didn't tune cuda-fixnum at all; I just changed the arch to 70 and ran bench.
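For readers unfamiliar with the operation being compared, here is a minimal CPU-side sketch of modular exponentiation by square-and-multiply in Python. It is only an illustration of the arithmetic, not how either cuda-fixnum or CGBN implements it, and it assumes that "1024 / 1024" means a 1024-bit base and exponent with a 1024-bit modulus. The names `modexp` and `random_operands` are hypothetical helpers invented for this example.

```python
import secrets


def modexp(base: int, exponent: int, modulus: int) -> int:
    """Compute base**exponent mod modulus by left-to-right square-and-multiply."""
    if modulus <= 0:
        raise ValueError("modulus must be positive")
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    if modulus == 1:
        return 0
    result = 1
    base %= modulus
    # Walk the exponent bits from most significant to least significant.
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus    # square on every bit
        if bit == "1":
            result = (result * base) % modulus  # multiply when the bit is set
    return result


def random_operands(bits: int = 1024):
    """Hypothetical helper: random operands of the assumed benchmark size."""
    # Force the top bit (full width) and the low bit (odd modulus).
    modulus = secrets.randbits(bits) | (1 << (bits - 1)) | 1
    base = secrets.randbits(bits) % modulus
    exponent = secrets.randbits(bits)
    return base, exponent, modulus


if __name__ == "__main__":
    b, e, m = random_operands()
    # Cross-check against Python's built-in three-argument pow.
    print(modexp(b, e, m) == pow(b, e, m))
```

A GPU library would run many such exponentiations in parallel and would typically avoid the plain `%` reduction used here, so this sketch says nothing about the relative performance of the two libraries.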

unzvfu commented 3 years ago

Thanks @dave-andersen. If I recall correctly, cuda-fixnum was comparable in performance to XMP around the time I first wrote it, but I haven't given it much attention for a while; it really needs to be updated to make use of more recent architecture improvements. Great to see that CGBN is performing so well!