Closed lixinyuu closed 4 years ago
I have no idea what the environment variable NCPUS is, but I would suggest you just put 16 if that's the number you are going to use.
@shyuep Thanks, Shyue. NCPUS is the number of CPUs requested in PBS. It is 16 as well, and I see the same header lines in the OUTCAR in both cases (shown below), which indicates that this is not the cause.
vasp.5.4.4.18Apr17-6-g9f103f2a35 (build Mar 18 2020 13:00:55) complex
executed on LinuxIFC date 2020.05.15 21:28:44
running on 16 total cores
distrk: each k-point on 16 cores, 1 groups
distr: one band on NCORES_PER_BAND= 4 cores, 4 groups
In custodian, VASP is called at https://github.com/materialsproject/custodian/blob/883b491e6c6335a75ca88e2855948ccf4f881696/custodian/vasp/jobs.py#L278. Have you seen any case where subprocess.Popen(cmd) is slower than running cmd directly? Thanks.
I am not aware of any reason why subprocess would be slower than running cmd directly. As far as we have tested on our systems, this does not seem to be the case.
Thanks, Professor. The problem has been identified: the environment variable OMP_NUM_THREADS needs to be set to 1 for VASP 5.4. I don't know the detailed mechanism, but the problem has been solved. Thanks.
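For anyone hitting the same issue, here is a minimal sketch of one way to pin OMP_NUM_THREADS for a process launched through subprocess.Popen. The helper name run_with_single_omp_thread is hypothetical and is not part of custodian, and the probe command is only a stand-in for the real VASP command. Since a child process inherits the parent's environment by default, exporting OMP_NUM_THREADS=1 in the job script before starting custodian should have the same effect.

```python
import os
import subprocess
import sys


def run_with_single_omp_thread(cmd, omp_threads=1):
    """Run cmd with OMP_NUM_THREADS set in the child's environment.

    Hypothetical helper for illustration; not a custodian function.
    """
    # Copy the current environment so every other variable is passed through.
    env = os.environ.copy()
    env["OMP_NUM_THREADS"] = str(omp_threads)
    with subprocess.Popen(cmd, env=env, stdout=subprocess.PIPE, text=True) as proc:
        out, _ = proc.communicate()
    return proc.returncode, out


if __name__ == "__main__":
    # Stand-in command that just echoes the variable. On the cluster this
    # would be something like ["mpirun", "-np", "16", "vasp_std"].
    probe = [sys.executable, "-c", "import os; print(os.environ['OMP_NUM_THREADS'])"]
    code, out = run_with_single_omp_thread(probe)
    print(code, out.strip())
```

Passing an explicit env keeps the setting local to the launched command instead of changing the environment of the calling Python process.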
System
Summary
My university has migrated from Ubuntu to RedHat, and I have run into a problem: VASP is slower when I call it from custodian than when I call it directly with "mpirun -np 16 vasp_std". Are there any possible reasons for this kind of weird behaviour?
Example code
With the same input files, here are the OUTCARs after one hour of running:
Custodian - Only the first iteration finished
--------------------------------------- Iteration 1( 1) ---------------------------------------
energy without entropy = -126.53388816 energy(sigma->0) = -126.64889039
--------------------------------------- Iteration 1( 28) ---------------------------------------
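To compare how far two runs got without reading the OUTCARs by eye, the iteration markers shown above can be parsed. This is a minimal sketch, assuming the marker format in the excerpts (first number the ionic step, number in parentheses the electronic step). The function name last_iteration is hypothetical, and the sample text reuses only the two marker lines quoted in this issue.

```python
import re

# Matches markers such as "Iteration 1( 28)"; the whitespace is flexible
# because real OUTCAR files pad the numbers with extra spaces.
ITER_RE = re.compile(r"Iteration\s+(\d+)\(\s*(\d+)\)")


def last_iteration(outcar_text):
    """Return (ionic_step, electronic_step) of the last marker, or None."""
    matches = ITER_RE.findall(outcar_text)
    if not matches:
        return None
    ionic, electronic = matches[-1]
    return int(ionic), int(electronic)


sample = """\
--------------------------------------- Iteration 1( 1) ---------------------------------------
--------------------------------------- Iteration 1( 28) ---------------------------------------
"""
print(last_iteration(sample))  # (1, 28)
```

In practice the text would come from reading each run's OUTCAR file, and the two results can then be compared directly.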