ponweist / Wannier90-PRACE

Optimizations for Wannier90 (fork repository - see http://wannier.org for the official version).
GNU General Public License v2.0

Use BLAS for utility_rotate_diag #10

Closed ponweist closed 10 years ago

ponweist commented 10 years ago

The routine utility_rotate_diag (utility.F90) accounts for 33% of the total runtime of the 16sm test case when run with the following parameters:

kpath = F

kslice = T
kslice_task=fermi_lines
kslice_fermi_lines_colour=spin
kslice_2dkmesh = 100 100
!below is 0.0  0.0  1/8 half of L point
kslice_corner = 0.25  0.0  0.25
kslice_b1 =     1.0  1.0  0.0
kslice_b2 =     0.0  1.0  1.0

berry = F

The current code of utility_rotate_diag is:

  function utility_rotate_diag(mat,rot,dim)
    !===========================================================!
    !                                                           !
    ! Rotates the dim x dim matrix 'mat' according to           !
    ! (rot)^dagger.mat.rot, where 'rot' is a unitary matrix.    !
    ! Computes only the diagonal elements of rotated matrix.    !
    !                                                           !
    !===========================================================!

    use w90_constants, only : dp

    integer          :: dim
    complex(kind=dp) :: utility_rotate_diag(dim)
    complex(kind=dp) :: mat(dim,dim)
    complex(kind=dp) :: rot(dim,dim)

    utility_rotate_diag=utility_matmul_diag(matmul(transpose(conjg(rot)),mat),rot,dim)

  end function utility_rotate_diag

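The matrix product is currently computed with the intrinsic matmul on an explicitly formed transpose(conjg(rot)). BLAS should be used for this product instead.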
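A minimal sketch of what a BLAS-based version could look like, not necessarily the change that was committed in this fork. It relies on the identity diag(rot^dagger.mat.rot)(i) = sum_j conjg(rot(j,i)) * (mat.rot)(j,i): one ZGEMM call forms mat.rot, and the diagonal then follows from one dot product per column, so the second full matrix product is never formed. The module name rotate_diag_sketch and the function name rotate_diag_blas are hypothetical, and dp is defined locally as a stand-in for the one in w90_constants. The code must be linked against a BLAS library (for example with -lblas).

```fortran
module rotate_diag_sketch
  ! Hypothetical module, for illustration only; not part of Wannier90.
  implicit none
  private
  public :: dp, rotate_diag_blas

  ! Assumption: stand-in for dp from w90_constants (double precision,
  ! as required by the ZGEMM interface).
  integer, parameter :: dp = kind(1.0d0)

contains

  function rotate_diag_blas(mat, rot, dim) result(diag)
    ! Returns only the diagonal elements of (rot)^dagger.mat.rot.
    integer, intent(in)          :: dim
    complex(kind=dp), intent(in) :: mat(dim,dim)
    complex(kind=dp), intent(in) :: rot(dim,dim)
    complex(kind=dp)             :: diag(dim)

    complex(kind=dp), parameter :: one  = (1.0_dp, 0.0_dp)
    complex(kind=dp), parameter :: zero = (0.0_dp, 0.0_dp)
    complex(kind=dp) :: tmp(dim,dim)
    integer          :: i
    external         :: zgemm

    ! tmp = mat . rot  (standard BLAS level-3 call)
    call zgemm('N', 'N', dim, dim, dim, one, mat, dim, rot, dim, zero, tmp, dim)

    ! diag(i) = sum_j conjg(rot(j,i)) * tmp(j,i).
    ! The intrinsic dot_product conjugates its first argument for
    ! complex vectors, so no explicit conjg is needed here.
    do i = 1, dim
      diag(i) = dot_product(rot(:,i), tmp(:,i))
    end do
  end function rotate_diag_blas

end module rotate_diag_sketch
```

Calling rotate_diag_blas(mat, rot, dim) takes the same arguments as utility_rotate_diag. For large dim, the automatic array tmp may be better declared allocatable to avoid stack limits.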

ponweist commented 10 years ago

utility_rotate_diag is now about 5 times faster (from 1e11 to 2e10 CPU cycles for the 16sm test case; see also #8).

The output file 16sm-kslice-fermi-spn.dat did not change.