nguyen-group / QERaman

An open-source program for computing first-order resonance Raman spectra, based on Quantum ESPRESSO
https://doi.org/10.1016/j.cpc.2023.108967
GNU General Public License v3.0

Issue Encountered when Running ph_mat.x with -npool option #13

Open MouadBik opened 6 months ago

MouadBik commented 6 months ago

Dear developers,

I am reaching out to report an issue with the QERaman package. When I run the ph_mat.x and raman.x executables with Quantum ESPRESSO versions 7.1, 7.2, and even the newer 7.3, I consistently get segmentation faults and end-of-file errors.

To provide some context, I employed the following command:

    mpirun -np 64 ph_mat.x -npool 32 -in ph.in > ph.out

Regardless of whether I used the gcc or ifort versions, the issue persisted. Below, I've outlined the errors encountered:

Segmentation fault error:

    error: forrtl: severe (174): SIGSEGV, segmentation fault occurred
    Image              PC                Routine            Line     Source
    libpthread-2.28.s  0000151182442CF0  Unknown            Unknown  Unknown
    ph_mat.x           0000000001125CEF  Unknown            Unknown  Unknown
    ph_mat.x           0000000001122CF0  Unknown            Unknown  Unknown
    ph_mat.x           00000000011226A0  Unknown            Unknown  Unknown
    ph_mat.x           00000000010DF15C  Unknown            Unknown  Unknown
    ph_mat.x           00000000010DD451  Unknown            Unknown  Unknown
    ph_mat.x           0000000000410671  elphfilepc         169      supp.f90
    ph_mat.x           000000000040EA91  dophonon2          153      do_phonon2.f90
    ph_mat.x           000000000040886D  MAIN__             78       phonon2.f90
    ph_mat.x           00000000004087CD  Unknown            Unknown  Unknown
    libc-2.28.so       00001511818D5D85  __libc_start_main  Unknown  Unknown
    ph_mat.x           00000000004086EE  Unknown            Unknown  Unknown

End-of-file error:

    forrtl: severe (24): end-of-file during read, unit 102, file /Raman/ifort-nk/./Si.elph
    Image              PC                Routine            Line     Source
    raman.x            0000000000432767  Unknown            Unknown  Unknown
    raman.x            00000000004137B2  raman_read_mp_rea  230      raman_read.f90
    raman.x            0000000000404902  MAIN__             52       raman.f90
    raman.x            00000000004041CD  Unknown            Unknown  Unknown
    libc-2.28.so       000015227FAF1D85  __libc_start_main  Unknown  Unknown
    raman.x            00000000004040EE  Unknown            Unknown  Unknown

I attempted to isolate the problem and found that the errors were not encountered when using the following simplified command:

    mpirun -np 64 ph_mat.x -in ph.in > ph.out

This suggests that the issue might be related to the -npool option.

I kindly request your assistance in diagnosing and resolving this issue. If further information or assistance is required from my end, please do not hesitate to reach out. Your prompt attention to this matter would be greatly appreciated.

Thank you for your time and support.

Best regards, Mouad Bikerouin

nguyen-group commented 6 months ago

Dear Mouad Bikerouin,

Thank you for reporting this important issue. You are correct that the modified code ph_mat.x does not support MPI runs with the -npool option. I will check how to modify ph.x to support it. The newest QE 7.3 has a new option for printing electron-phonon output, which might help in upgrading QERaman to support the -npool option.

Best regards, Nguyen

MouadBik commented 6 months ago

Dear Nguyen,

Thank you for your prompt response and for acknowledging the issue I reported.

I appreciate your confirmation that the modified ph_mat.x code does not currently support the '-npool' option in MPI runs. I understand that modifying the 'ph.x' code to support the '-npool' option may require some adjustments.

Regarding your suggestion to use QERaman with the latest version, Quantum ESPRESSO 7.3: I tried that but encountered the same problems. Could you please provide more detailed guidance on how to upgrade QERaman to accommodate the '-npool' option?

Thank you once again for your attention to this matter and for your ongoing support.

Best regards, Mouad Bikerouin

nguyen-group commented 6 months ago

Hi Mouad Bikerouin,

In QE 7.3, I found a new option in PH/phq_readin.f90: electron_phonon = 'prt'. It prints the absolute values of the electron-phonon matrix elements for all band indices and phonon modes. For Raman, we need the complex values. Compared with QE 7.2, it might be easy to modify the electron_phonon = 'prt' option in QE 7.3 to obtain the complex el-ph matrix elements. But first, we should check whether electron_phonon = 'prt' in QE 7.3 supports the '-npool' option. If it does, I think we can base a modification on it. The routine, SUBROUTINE elph_prt(), can be found in PH/elphon.f90.
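To illustrate why absolute values alone are not enough, here is a toy Python sketch. It is not QERaman or QE code: the numbers are made up, the `intensity` helper is hypothetical, and it assumes only that complex contributions are summed before the modulus is squared. In that setting the relative phases decide whether contributions add or cancel, and that information is lost once only magnitudes are printed.

```python
import cmath

def intensity(amplitudes):
    """Squared modulus of the coherent sum of the given amplitudes."""
    return abs(sum(amplitudes)) ** 2

# Hypothetical matrix elements: equal magnitude, opposite phase.
g = [cmath.rect(1.0, 0.0), cmath.rect(1.0, cmath.pi)]

with_phase = intensity(g)                        # phases retained
magnitude_only = intensity([abs(x) for x in g])  # phases discarded

print(f"complex values:       {with_phase:.3f}")
print(f"absolute values only: {magnitude_only:.3f}")
```

With the phases kept, the two toy contributions cancel; with magnitudes only, they add up instead.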

Best regards, Nguyen

MouadBik commented 1 month ago

Dear Nguyen,

I have resolved the issue after several attempts. The solution involves correcting lines 167 and 168 in 'supp.f90'. The lines should be updated as follows:

    ikk = ikks_collect(ik)
    ikq = ikqs_collect(ik)

With these changes, the '-nk' option (a synonym of '-npool' in Quantum ESPRESSO) should function correctly, reducing the calculation time by roughly a factor of three or more. I think it would be beneficial to update the code so that future users can make use of the -nk parallelization.
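One way to picture why the choice of index array can matter under pool parallelization is the conceptual Python sketch below. It is not taken from supp.f90. The `split_into_pools` helper, the block distribution, and the toy sizes are assumptions for illustration, and reading the `_collect` arrays as index maps gathered over all pools is an interpretation of their names only.

```python
def split_into_pools(n_kpoints, n_pools):
    """Distribute global k-point indices over pools in contiguous blocks."""
    base, extra = divmod(n_kpoints, n_pools)
    pools, start = [], 0
    for p in range(n_pools):
        size = base + (1 if p < extra else 0)
        pools.append(list(range(start, start + size)))
        start += size
    return pools

n_kpoints, n_pools = 10, 4  # toy sizes, chosen for illustration
pools = split_into_pools(n_kpoints, n_pools)

# Pool-local storage: each pool only holds data for its own k-points.
local_data = [[f"k{g}" for g in pool] for pool in pools]

# Gathered ("collected") storage: one list covering every k-point.
collected = [item for pool_data in local_data for item in pool_data]

# A loop over all k-points is only safe on the collected list.
for ik in range(n_kpoints):
    assert collected[ik] == f"k{ik}"

# Indexing pool-local data with a global index runs out of bounds.
# Python raises IndexError; in Fortran without bounds checking, this
# kind of access can instead show up as a segmentation fault.
try:
    local_data[0][n_kpoints - 1]
    out_of_bounds = False
except IndexError:
    out_of_bounds = True

print(pools, out_of_bounds)
```

In this toy setup the first pool holds only three of the ten k-points, so any loop that runs over the full k-point list has to use arrays that cover the full list as well.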

Best regards, Mouad Bikerouin

nguyen-group commented 1 month ago

Dear Mouad Bikerouin,

Thank you very much for your efforts. We appreciate them and will include an acknowledgment in the update. After testing to double-check the fix, we will release it in version 1.1.

Best regards, Nguyen