Open unzvfu opened 4 years ago
XMP 2.0 is called CGBN: https://github.com/NVlabs/CGBN. A quick benchmark of a 1024 / 1024 modular exponentiation suggests CGBN is about 2.5 times faster on a Titan-V, but I didn't tune cuda-fixnum; I just changed the arch to 70 and ran bench.
Thanks @dave-andersen. If I recall correctly, cuda-fixnum was comparable in performance to XMP around the time I first wrote it, but I haven't given it much attention for a while; it really needs to be updated to make use of more recent architecture improvements. Great to see that CGBN is performing so well!
See https://nvlabs.github.io/xmp/