ponweist closed this issue 9 years ago
Testcases and runtimes after f0ba9074f381cb489fccd0d1d2e3660a01c9ae15:
Testcase A parameters:
kpath = F
kslice = T
kslice_task=fermi_lines,curv
kslice_2dkmesh = 100 100
!below is 0.0 0.0 1/8 half of L point
kslice_corner = 0.25 0.0 0.25
kslice_b1 = 1.0 1.0 0.0
kslice_b2 = 0.0 1.0 1.0
berry = F
Runtime: 133s
Testcase B parameters:
kpath = F
kslice = T
kslice_task=fermi_lines,morb
kslice_2dkmesh = 100 100
!below is 0.0 0.0 1/8 half of L point
kslice_corner = 0.25 0.0 0.25
kslice_b1 = 1.0 1.0 0.0
kslice_b2 = 0.0 1.0 1.0
berry = F
Runtime: 207s
Testcase C parameters:
kpath = F
kslice = T
kslice_task=fermi_lines
kslice_fermi_lines_colour=spin
kslice_2dkmesh = 100 100
!below is 0.0 0.0 1/8 half of L point
kslice_corner = 0.25 0.0 0.25
kslice_b1 = 1.0 1.0 0.0
kslice_b2 = 0.0 1.0 1.0
berry = F
Runtime: 102s
Testcase D parameters:
kpath = F
kslice = T
kslice_task=fermi_lines
kslice_fermi_lines_colour=none
kslice_2dkmesh = 100 100
!below is 0.0 0.0 1/8 half of L point
kslice_corner = 0.25 0.0 0.25
kslice_b1 = 1.0 1.0 0.0
kslice_b2 = 0.0 1.0 1.0
berry = F
Runtime: 57s
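All four parameter sets use the same slice geometry (`kslice_corner`, `kslice_b1`, `kslice_b2`, `kslice_2dkmesh`), so the number of k-points per run is the same. The sketch below shows one way such a slice could be enumerated. It assumes the points are `corner + (i/n1)*b1 + (j/n2)*b2` in reduced coordinates with both edges included; the function name `kslice_points` is hypothetical and this is not taken from kslice.F90.

```python
def kslice_points(corner, b1, b2, mesh):
    """Enumerate k-points (reduced coordinates) on a 2D slice.

    Assumption: i runs over 0..n1 and j over 0..n2, so the slice
    edges are included and there are (n1+1)*(n2+1) points in total.
    """
    n1, n2 = mesh
    points = []
    for j in range(n2 + 1):        # slow index, along b2
        for i in range(n1 + 1):    # fast index, along b1
            points.append(tuple(
                c + (i / n1) * u + (j / n2) * v
                for c, u, v in zip(corner, b1, b2)
            ))
    return points


# Values copied from the testcase parameters.
pts = kslice_points((0.25, 0.0, 0.25), (1.0, 1.0, 0.0), (0.0, 1.0, 1.0), (100, 100))
print(len(pts))         # 10201
print(pts[0], pts[-1])  # (0.25, 0.0, 0.25) (1.25, 2.0, 1.25)
```

Under that assumption each run evaluates 10201 independent k-points, which is why the loop over them is a natural target for parallelization.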
Runtime for Testcase C improved from 102s to 67s after optimizing utility_rotate_diag (bbdd774710b20cd658c3d31bd97cbf0f2a975aa1).
Output files for all 4 testcases are again identical to those from the reference runs (f0ba9074f381cb489fccd0d1d2e3660a01c9ae15).
Parallelization is now done in the new branch "iss8"; the changes should be merged back into the master branch after #13 is fixed.
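For readers unfamiliar with the general idea: a common way to parallelize a loop over independent k-points is to deal the iterations out round-robin across processes and combine the partial results at the end. The sketch below simulates that pattern in a single Python process. It is only an illustration and not the code in "iss8"; `my_iterations`, `simulate`, and the toy `work` function are hypothetical names, and no real MPI calls are involved.

```python
def my_iterations(n_items, rank, n_ranks):
    """Indices handled by `rank` when items are dealt out round-robin."""
    return range(rank, n_items, n_ranks)


def simulate(n_items, n_ranks, work):
    """Mimic n_ranks processes, each filling only its own entries."""
    partial = []
    for rank in range(n_ranks):
        local = [0.0] * n_items            # each rank starts from zeros
        for idx in my_iterations(n_items, rank, n_ranks):
            local[idx] = work(idx)         # independent per-k-point work
        partial.append(local)
    # Element-wise sum stands in for a reduction across ranks.
    return [sum(col) for col in zip(*partial)]


serial = [float(k * k) for k in range(10)]
parallel = simulate(10, 4, lambda k: float(k * k))
print(parallel == serial)  # True
```

Because every index is owned by exactly one rank, the summed result matches the serial loop, which is the same property the identical-output check on the testcases verifies.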
Output files for all 4 testcases are identical to those of f0ba9074f381cb489fccd0d1d2e3660a01c9ae15.
Timings:
Testcase | Original | New |
---|---|---|
A | 133s | 19.5s |
B | 207s | 21.5s |
C | 102s | 15.2s |
D | 57s | 6.4s |
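The speedup factors implied by the table can be worked out directly; the numbers are copied from the table and nothing else is assumed.

```python
# (original, new) runtimes in seconds, copied from the timings table.
timings = {
    "A": (133.0, 19.5),
    "B": (207.0, 21.5),
    "C": (102.0, 15.2),
    "D": (57.0, 6.4),
}

speedups = {name: old / new for name, (old, new) in timings.items()}
for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
# A: 6.8x
# B: 9.6x
# C: 6.7x
# D: 8.9x
```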
Trace for testcase C:
The following loop (kslice.F90) needs to be parallelized: