OrderN / CONQUEST-release

Full public release of large scale and linear scaling DFT code CONQUEST
http://www.order-n.org/
MIT License
94 stars 24 forks source link

Refactor duplicated code in m_kern_exx_ subroutines #346

Open tkoskela opened 2 weeks ago

tkoskela commented 2 weeks ago

Based on our meeting with @lionelalexandre on April 12, we need to investigate how much we can reduce code duplication in the subroutines

https://github.com/OrderN/CONQUEST-release/blob/c00a071c18a257d4e708abaede7bb4769b80ccbd/src/exx_kernel_default.f90#L921

https://github.com/OrderN/CONQUEST-release/blob/c00a071c18a257d4e708abaede7bb4769b80ccbd/src/exx_kernel_default.f90#L1268

https://github.com/OrderN/CONQUEST-release/blob/c00a071c18a257d4e708abaede7bb4769b80ccbd/src/exx_kernel_default.f90#L1649

https://github.com/OrderN/CONQUEST-release/blob/c00a071c18a257d4e708abaede7bb4769b80ccbd/src/exx_kernel_default.f90#L2019

and do some refactoring before carrying on with the parallelisation work

connoraird commented 2 weeks ago

After comparing m_kern_exx_cri and m_kern_exx_eri, it does not look simple to reduce their duplication through refactoring. Many similar things are happening in both but the loop structures are different. For example in m_kern_exx_cri there is the loop do l = 1, nbnab(k_in_part) and inside this loop K_val and Phy_k are set before the loop is closed. However, in m_kern_exx_eri this same loop contains the setting of K_val as well as the rest of the kernel.

Further down the loop nest, the loop structure differs again. In m_kern_exx_cri there are two nested loops over kg%nsup and jb%nsup but in m_kern_exx_eri the loop is not nested and is only over jb%nsup.

connoraird commented 2 weeks ago

After comparing m_kern_exx_eri and m_kern_exx_eri_gto, the only differences seem to be that arrays are allocated in m_kern_exx_eri but not m_kern_exx_eri_gto and that m_kern_exx_eri_gto calls compute_eri_hoh at it's deepest point whereas m_kern_exx_eri calls exx_ewald_charge and exx_v_on_grid and performs calculations in a triple nested loop.

Therefore, it should be possible to reduce code duplication between these two kernels by first moving array allocations in m_kern_exx_eri outside of the duplicated region.

yes, this is correct ; I should have done this before...

connoraird commented 2 weeks ago

m_kern_exx_eri and m_kern_exx_dummy are also very similar with the main difference being that arrays are allocated at the beginning of m_kern_exx_dummy. Therefore, this can also likely be refactored to deduplicate code.

lionelalexandre commented 2 weeks ago

m_kern_exx_eri and m_kern_exx_dummy are also very similar with the main difference being that arrays are allocated at the beginning of m_kern_exx_dummy. Therefore, this can also likely be refactored to deduplicate code.

I am not sure but you will need to had a if statement in the nested loops. Can it be a problem ?

lionelalexandre commented 2 weeks ago

After comparing m_kern_exx_eri and m_kern_exx_eri_gto, the only differences seem to be that arrays are allocated in m_kern_exx_eri but not m_kern_exx_eri_gto and that m_kern_exx_eri_gto calls compute_eri_hoh at it's deepest point whereas m_kern_exx_eri calls exx_ewald_charge and exx_v_on_grid and performs calculations in a triple nested loop.

Therefore, it should be possible to reduce code duplication between these two kernels by first moving array allocations in m_kern_exx_eri outside of the duplicated region.

yes, this is correct ; I should have done this before...

lionelalexandre commented 2 weeks ago

After comparing m_kern_exx_cri and m_kern_exx_eri, it does not look simple to reduce their duplication through refactoring. Many similar things are happening in both but the loop structures are different. For example in m_kern_exx_cri there is the loop do l = 1, nbnab(k_in_part) and inside this loop K_val and Phy_k are set before the loop is closed. However, in m_kern_exx_eri this same loop contains the setting of K_val as well as the rest of the kernel.

Further down the loop nest, the loop structure differs again. In m_kern_exx_cri there are two nested loops over kg%nsup and jb%nsup but in m_kern_exx_eri the loop is not nested and is only over jb%nsup.

connoraird commented 1 week ago

I believe the code inside the do nsf2 = 1, jb%nsup loop is the same for m_kern_exx_cri and m_kern_exx_eri. Therefore, this should be able to be pulled out into a single subroutine.