Z2PackDev / Z2Pack

A tool for calculating topological invariants.
https://z2pack.greschd.ch
GNU General Public License v3.0

Runtime error diagnostics #212

Closed shahid-sattar closed 9 months ago

shahid-sattar commented 9 months ago

Dear developers, I am a new Z2Pack user working with VASP + Wannier90. I am able to run the Bi test example. However, the next step, a calculation for the Bi2Se3 test case, fails. All inputs look okay and the number of bands is consistent, but VASP crashes with a segmentation fault. This did not happen with the test example. Please suggest a possible solution.

POSCAR:

Bi2Se3
1.00000000000000
2.0882289007523460 1.2056395179789245 9.6214415845536436
-2.0882289007523460 1.2056395179789245 9.6214415845536436
-0.0000000000000001 -2.4112790359578411 9.6214415845536436
Bi Se
2 3
Direct
0.5997925669909683 0.5997925669909683 0.5997925669909683
0.4002074330090317 0.4002074330090317 0.4002074330090317
0.0000000000000000 0.0000000000000000 0.0000000000000000
0.2124990312419863 0.2124990312419863 0.2124990312419863
0.7875009687580137 0.7875009687580137 0.7875009687580137

INCAR:

SYSTEM = Bi2Se3

ISMEAR = 0
SIGMA = 0.01
PREC = A
ENCUT = 430
LREAL = F

LSORBIT = .TRUE.
SAXIS = 0 0 1
GGA_COMPAT = .FALSE.
ISYM = -1
LPEAD = .FALSE.
LWANNIER90 = .TRUE.
LWRITE_MMN_AMN = .TRUE.
NBANDS = 62
NSW = 0
ALGO = N
ICHARG = 11
NCORE = 1
LWAVE = .FALSE.

Error message:

+----------------------------------------------------------------------+
=================== SURFACE CALCULATION ===================
starting at 2024-02-27 13:26:20,326
running Z2Pack version 2.2.0

gap_tol: 0.3
init_result: None
iterator: range(8, 27, 2)
load: True
load_quiet: True
min_neighbour_dist: 0.01
move_tol: 0.3
num_lines: 11
pos_tol: 0.01
save_file: ./results/res_0.p
serializer: auto
surface: <function at 0x7efd88ba23e0>
system: <z2pack.fp._first_prin<...>ject at 0x7efd7b12c590>

+----------------------------------------------------------------------+

INFO: Adding lines required by 'num_lines'.
INFO: Adding line at t = 0.0
INFO: Calculating line for N = 8
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
vasp_ncl 00000000017C386D Unknown Unknown Unknown
libc.so.6 00007F52C0054DB0 Unknown Unknown Unknown
vasp_ncl 000000000065BA5B Unknown Unknown Unknown
vasp_ncl 0000000000DC3E2F Unknown Unknown Unknown
vasp_ncl 0000000000E35FA8 Unknown Unknown Unknown
vasp_ncl 0000000001531A55 Unknown Unknown Unknown
vasp_ncl 000000000150D07D Unknown Unknown Unknown
vasp_ncl 000000000040B46E Unknown Unknown Unknown
libc.so.6 00007F52C003FEB0 Unknown Unknown Unknown
libc.so.6 00007F52C003FF60 __libc_start_main Unknown Unknown
vasp_ncl 000000000040B369 Unknown Unknown Unknown
Traceback (most recent call last):
  File "/proj/ltu-fy/users/x_shasa/codes/anaconda3/envs/z2pack/lib/python3.12/site-packages/z2pack/fp/_read_mmn.py", line 16, in get_m
    with open(mmn_file, "r") as f:
         ^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/proj/ltu-fy/users/x_shasa/coll/allenm/mnbite4/bulk/no-relax-bulk/soc/test2/build/wannier90.mmn'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/proj/ltu-fy/users/x_shasa/coll/allenm/mnbite4/bulk/no-relax-bulk/soc/test2/run.py", line 34, in <module>
    result_0 = z2pack.surface.run(

greschd commented 9 months ago

> vasp segmentation fault issue

This indicates a problem within VASP, or how it was compiled.

Unfortunately I cannot help with this problem, as I don't have access to a VASP license. I would encourage you to consult the corresponding VASP documentation / forum.

shahid-sattar commented 9 months ago

Hi Dominik, thanks for the response. As I mentioned, I followed exactly the same procedure as for example 1, and that worked. I suspect the problem is caused by running VASP on a single core here. I tried running the files in the build directory directly: VASP enters its loop but does not run anything. May I ask how I can run Z2Pack in parallel in a SLURM job script?

Screenshot from 2024-02-27 13-49-16

greschd commented 9 months ago

Z2Pack itself isn't made for running in parallel (it's a single Python process). But you can submit it to a node with multiple cores, and then call mpirun as part of the VASP command as is done in the example: https://github.com/Z2PackDev/Z2Pack/blob/832a24c9169faa620b3e03fcaaab919a8a5c9d19/examples/fp/vasp/Bi/run.py#L25
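As a minimal sketch of the approach described above, the VASP command string could be assembled inside the SLURM job as follows. The helper `build_vasp_command` is hypothetical and not part of Z2Pack; `SLURM_NTASKS` is the environment variable SLURM sets inside a job, `vasp_ncl` is the binary name that appears in the traceback, and the exact `mpirun` invocation is an assumption that depends on the cluster's MPI installation.

```python
import os
import shlex


def build_vasp_command(vasp_binary="vasp_ncl", ntasks=None, log_file="log"):
    """Return a shell command that runs VASP under mpirun.

    Hypothetical helper, not part of Z2Pack. If ntasks is not given, it is
    read from SLURM_NTASKS (set by SLURM inside a job) and defaults to 1.
    """
    if ntasks is None:
        ntasks = int(os.environ.get("SLURM_NTASKS", "1"))
    if ntasks < 1:
        raise ValueError("ntasks must be at least 1")
    # Redirect stdout and stderr to the log file so a crash can be inspected.
    return (
        f"mpirun -np {ntasks} {shlex.quote(vasp_binary)} "
        f"> {shlex.quote(log_file)} 2>&1"
    )


if __name__ == "__main__":
    print(build_vasp_command(ntasks=16))
```

The resulting string would be passed as the `command` argument of `z2pack.fp.System`, as in the linked Bi example. The Python script itself is started once in the job script, not under `mpirun`.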

greschd commented 9 months ago

In general, I would expect the run on 1 core to be less prone to crashing, not more; unless it runs into resource limits. It will be slower of course, but not more error-prone.

This part of the output:

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
vasp_ncl 00000000017C386D Unknown Unknown Unknown
libc.so.6 00007F52C0054DB0 Unknown Unknown Unknown
vasp_ncl 000000000065BA5B Unknown Unknown Unknown
vasp_ncl 0000000000DC3E2F Unknown Unknown Unknown
vasp_ncl 0000000000E35FA8 Unknown Unknown Unknown
vasp_ncl 0000000001531A55 Unknown Unknown Unknown
vasp_ncl 000000000150D07D Unknown Unknown Unknown
vasp_ncl 000000000040B46E Unknown Unknown Unknown
libc.so.6 00007F52C003FEB0 Unknown Unknown Unknown
libc.so.6 00007F52C003FF60 __libc_start_main Unknown Unknown
vasp_ncl 000000000040B369 Unknown Unknown Unknown

I suspect comes directly from VASP. A segmentation fault is always an indication of a problem with the program being run: either a bug in the code or an incorrectly compiled/linked binary.
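In the output above, the Python `FileNotFoundError` is a follow-up error: VASP crashed before writing `wannier90.mmn`, so Z2Pack had nothing to read. A rough sketch of a check on the build directory that separates the two symptoms is shown here. The helper `diagnose_build_dir` is hypothetical, and the log file name is an assumption that depends on how the VASP command redirects its output.

```python
from pathlib import Path


def diagnose_build_dir(build_dir, log_name="log", mmn_name="wannier90.mmn"):
    """List likely reasons why a first-principles step failed.

    Hypothetical helper, not part of Z2Pack. The file names are assumptions
    based on the output shown in this thread.
    """
    build = Path(build_dir)
    findings = []

    log = build / log_name
    if log.is_file():
        text = log.read_text(errors="replace")
        # The Intel Fortran runtime reports crashes with a SIGSEGV message.
        if "SIGSEGV" in text or "segmentation fault" in text.lower():
            findings.append("segmentation fault reported in log")
    else:
        findings.append("no log file found")

    # Z2Pack reads the overlap matrices from this file after each run.
    if not (build / mmn_name).is_file():
        findings.append(f"{mmn_name} missing")

    return findings
```

If the log reports a segmentation fault, the missing `.mmn` file is only a consequence, and the place to look is the VASP run itself, for example by launching the same command by hand in the build directory.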

shahid-sattar commented 8 months ago

Hi again, I was able to fix this run-time issue, and Z2Pack now runs on 1 node and writes output files. However, I see a warning message and wonder whether the parameters in my input are correct.

I have NBANDS=62 in the DFT (VASP) OUTCAR, and I made sure the same value appears in the wannier90.win file and in the build/OUTCAR file written when running Z2Pack.

The code automatically generates the KPOINTS file (strangely with only 1 k-point in the x and y directions and many in the z direction, for example "1 1 29 ! subdivisions") and goes through the loop many times, but sometimes writes "WARNING: Iterator stopped before the calculation could converge".

I tweaked some parameters, but this does not seem to have any effect. May I ask what the right strategy is here?

Second, I can see that the wannier90.amn, .mmn and .eig files are recalculated repeatedly and used. Can I directly use a pre-calculated wannier90_hr.dat file for a quicker computation?

Thanks in advance, again, for your valuable support.

Screenshot from 2024-03-26 15-18-57

Here, I provide the input file.

Screenshot from 2024-03-26 15-23-10

greschd commented 8 months ago

Hi,

> I have NBANDS=62 in the DFT (VASP) OUTCAR, and I made sure the same value appears in the wannier90.win file and in the build/OUTCAR file written when running Z2Pack.

I suspect something is wrong with the input: unless the system is very problematic, convergence should not take this many steps. A common cause with VASP is that the number of bands is automatically increased to a multiple of the number of cores used in the calculation. The bands ignored by Wannier90 (the exclude_bands setting in wannier90.win) need to be adjusted to match this.
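A small sketch of how this mismatch could be checked: read the band count VASP actually used from OUTCAR, compare it with the requested value, and build a matching `exclude_bands` line. The rounding rule in `round_up_to_multiple` is an assumption taken from the description above (the exact rule depends on VASP's parallelization settings), the helper names are hypothetical, and the first excluded band in the usage note is a placeholder rather than the correct value for Bi2Se3.

```python
import re


def read_nbands(outcar_text):
    """Return the NBANDS value that VASP reports in OUTCAR, or None."""
    match = re.search(r"NBANDS\s*=\s*(\d+)", outcar_text)
    return int(match.group(1)) if match else None


def round_up_to_multiple(nbands, ncores):
    """Assumed rounding rule: smallest multiple of ncores that is >= nbands."""
    if nbands < 1 or ncores < 1:
        raise ValueError("nbands and ncores must be positive")
    return -(-nbands // ncores) * ncores


def exclude_bands_line(first_excluded, nbands_actual):
    """Build a wannier90.win line excluding bands first_excluded..nbands_actual."""
    if first_excluded > nbands_actual:
        raise ValueError("first_excluded exceeds the number of bands")
    if first_excluded == nbands_actual:
        return f"exclude_bands = {first_excluded}"
    return f"exclude_bands = {first_excluded}-{nbands_actual}"
```

For instance, `round_up_to_multiple(62, 16)` gives 64 under this assumed rule, so a request for 62 bands on 16 cores would leave two extra bands that Wannier90 must also be told to ignore, e.g. `exclude_bands_line(49, 64)` if band 49 were the first unoccupied band (a placeholder value). Comparing `read_nbands` on `build/OUTCAR` with the value in the INCAR shows whether VASP changed the band count.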

> The code automatically generates the KPOINTS file (strangely with only 1 k-point in the x and y directions and many in the z direction, for example "1 1 29 ! subdivisions")

This is intentional: Z2Pack converges the calculation first in the x-direction (by increasing the number of k-points along a line) and then in the y-direction (by increasing the number of lines). This is necessary because the Berry flux can be concentrated in a small region of k-space. Without convergence, the results are rarely reliable. As such, it's also not possible to use a single pre-computed k-point grid.
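The line convergence can be illustrated with a simplified sketch. This is not Z2Pack's implementation: `converge_line` and `compute_wcc` are hypothetical names, and the comparison of sorted Wannier charge centers (WCC) is a simplification. Only the `iterator: range(8, 27, 2)` and `pos_tol: 0.01` values are taken from the output earlier in this thread. When the iterator runs out before the WCC stop moving, the situation corresponds to the "Iterator stopped before the calculation could converge" warning.

```python
def periodic_distance(a, b):
    """Distance between two WCC positions on the unit circle [0, 1)."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


def converge_line(compute_wcc, iterator, pos_tol):
    """Increase the number of k-points N along one line until the WCC settle.

    Simplified illustration, not Z2Pack's algorithm. compute_wcc(n) is a
    hypothetical callable that returns the WCC for n k-points on the line.
    Returns (last n tried, WCC at that n, converged flag).
    """
    previous = None
    n = None
    for n in iterator:
        current = sorted(x % 1.0 for x in compute_wcc(n))
        if previous is not None and len(previous) == len(current):
            # Largest movement of any WCC since the previous step.
            shift = max(
                (periodic_distance(a, b) for a, b in zip(previous, current)),
                default=0.0,
            )
            if shift <= pos_tol:
                return n, current, True
        previous = current
    # Iterator exhausted without meeting the tolerance.
    return n, previous, False
```

Each step of such a loop needs a fresh first-principles run with a different number of k-points on the line, which is why the KPOINTS file and the Wannier90 overlap files are regenerated for every step.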