SunnySuite / Sunny.jl

Spin dynamics and generalization to SU(N) coherent states

Batched diagonalization on CUDA GPUs #219

Open kbarros opened 8 months ago

kbarros commented 8 months ago

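cuSOLVER provides a Jacobi-based eigensolver for Hermitian matrices, `syevj`, along with a batched variant (`syevjBatched`) that diagonalizes many matrices of the same size in one call: https://docs.nvidia.com/cuda/cusolver/index.html#cusolverdn-t-syevj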

The performance benefit may depend strongly on matrix size, among other factors: https://discourse.julialang.org/t/eigenvalues-for-lots-of-small-matrices-gpu-batched-vs-cpu-eigen/50792

We could consider using this to accelerate LSWT. Note that for many LSWT calculations, especially in dipole mode, the diagonalization subroutine itself may not be the dominant cost. For this to pay off, we would probably need to move much more of the calculation onto the GPU (e.g., building the matrix for each q).
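
For reference, here is a minimal CPU sketch of the idea behind a batched Jacobi eigensolver. It is written in plain Python for illustration only (not Julia or CUDA), handles real symmetric matrices only (the complex Hermitian case needs complex rotations), and is not cuSOLVER's implementation. The names `jacobi_eigvals` and `batched_eigvals` are hypothetical. The point is that each matrix in the batch is processed independently, which is what lets a GPU handle many small matrices in parallel.

```python
import math

def jacobi_eigvals(a, tol=1e-12, max_sweeps=50):
    """Eigenvalues (ascending) of one real symmetric matrix via cyclic Jacobi sweeps."""
    n = len(a)
    a = [list(row) for row in a]  # work on a copy
    for _ in range(max_sweeps):
        # Frobenius norm of the off-diagonal part; stop once it is negligible
        off = math.sqrt(sum(a[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Choose the rotation that zeroes a[p][q]
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.sqrt(theta * theta + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                # Apply the rotation to columns p, q, then to rows p, q
                for k in range(n):
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))

def batched_eigvals(batch):
    """Diagonalize every matrix in the batch; each one is independent of the others."""
    return [jacobi_eigvals(a) for a in batch]

# Example batch of same-sized matrices, as a batched solver would expect
batch = [
    [[2.0, 1.0], [1.0, 2.0]],
    [[0.0, 1.0], [1.0, 0.0]],
    [[3.0, 0.0], [0.0, 1.0]],
]
spectra = batched_eigvals(batch)
```

On a GPU the outer loop over the batch is what gets parallelized, so the benefit should be largest when there are many matrices that are each small.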