Closed O2-AC closed 6 months ago
Dear Ole,
How big is the system that you calculate (number of atoms per image and number of images)? Are you running calculations in parallel with cluster: true? Did you do a diff of the xtb log files from, let's say, the first image in the first calculation to see if there is a difference besides the different number of threads?
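A comparison like the one suggested here could be scripted roughly as follows. This is only a minimal sketch: the function names, the file labels in the diff header, and the list of ignored substrings are assumptions chosen for illustration, not part of pysisyphus or xtb.

```python
import difflib

# Substrings of lines that are expected to differ between runs
# (assumed list for illustration; extend as needed).
IGNORED = ("omp threads", "wall-time", "cpu-time", "ratio c/w",
           "started run", "finished run", "hostname")


def relevant_lines(log_text):
    """Drop lines that differ trivially between runs (threads, timings, dates)."""
    return [
        line.rstrip()
        for line in log_text.splitlines()
        if not any(key in line for key in IGNORED)
    ]


def diff_logs(log_a, log_b):
    """Return unified-diff lines for the non-trivial parts of two xtb logs."""
    return list(
        difflib.unified_diff(
            relevant_lines(log_a), relevant_lines(log_b),
            fromfile="1_thread.log", tofile="8_threads.log", lineterm="",
        )
    )
```

An empty result means the two logs agree apart from thread counts, timings, and timestamps. Anything that remains, such as a changed energy or gradient norm, is worth a closer look.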
The problem can be reproduced with your Diels-Alder example.
System size: 12 images, 16 atoms
I have not set cluster: true.
Upon closer inspection, this might be a problem which originates from xTB.
Below are the shortened outputs of the command
xtb diels_alder_educt.xyz --chrg 0 --uhf 0 --acc 1.0 --gfn 2 --grad
with diels_alder_educt.xyz:
16
xyz file from https://github.com/ZimmermanGroup/pyGSM, MIT License
C -1.06001665 -1.51714564 0.05288674
C -1.82955412 -0.59408623 -0.53968755
C -2.01260392 0.79370866 -0.08977969
C -1.09740592 1.54095108 0.54110413
H -2.40063347 -0.88235561 -1.42321617
H -0.51365172 -1.30383154 0.96828855
H -0.96688964 -2.52122005 -0.35117707
H -1.32088987 2.55628492 0.85667305
H -0.09454533 1.17390089 0.74481476
H -2.98142355 1.23828062 -0.32070817
C 3.01841440 -0.33274049 0.53420511
C 2.48267950 0.16990394 -0.57660955
H 3.89171154 0.11254122 1.00536203
H 2.60849064 -1.21591676 1.01902222
H 1.60806045 -0.27640639 -1.04373753
H 2.89526366 1.05096138 -1.06301386
1 thread:
-------------------------------------------------
| Calculation Setup |
-------------------------------------------------
program call : xtb diels_alder_educt.xyz --chrg 0 --uhf 0 --acc 1.0 --gfn 2 --grad
hostname : [...]
coordinate file : diels_alder_educt.xyz
omp threads : 1
<...>
-------------------------------------------------
| TOTAL ENERGY -17.821111237146 Eh |
| GRADIENT NORM 0.032546379825 Eh/α |
| HOMO-LUMO GAP 3.849049936451 eV |
-------------------------------------------------
------------------------------------------------------------------------
* finished run on 2024/03/30 at 10:19:09.727
------------------------------------------------------------------------
total:
* wall-time: 0 d, 0 h, 0 min, 0.028 sec
* cpu-time: 0 d, 0 h, 0 min, 0.019 sec
* ratio c/w: 0.691 speedup
SCF:
normal termination of xtb
Note: The following floating-point exceptions are signalling: IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
* wall-time: 0 d, 0 h, 0 min, 0.006 sec
* cpu-time: 0 d, 0 h, 0 min, 0.006 sec
* ratio c/w: 0.994 speedup
8 threads:
<...>
* started run on 2024/03/30 at 10:23:13.686
-------------------------------------------------
| Calculation Setup |
-------------------------------------------------
program call : xtb diels_alder_educt.xyz --chrg 0 --uhf 0 --acc 1.0 --gfn 2 --grad
hostname : [...]
coordinate file : diels_alder_educt.xyz
omp threads : 8
<...>
-------------------------------------------------
| TOTAL ENERGY -17.821111237146 Eh |
| GRADIENT NORM 0.032546379512 Eh/α |
| HOMO-LUMO GAP 3.849049942210 eV |
-------------------------------------------------
normal termination of xtb
Note: The following floating-point exceptions are signalling: IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
------------------------------------------------------------------------
* finished run on 2024/03/30 at 10:23:14.425
------------------------------------------------------------------------
total:
* wall-time: 0 d, 0 h, 0 min, 0.740 sec
* cpu-time: 0 d, 0 h, 0 min, 5.093 sec
* ratio c/w: 6.886 speedup
SCF:
* wall-time: 0 d, 0 h, 0 min, 0.625 sec
* cpu-time: 0 d, 0 h, 0 min, 4.328 sec
* ratio c/w: 6.921 speedup
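To put a number on the difference between the two runs, the timing lines can be parsed and compared. The snippet below is a small sketch, not a tool from xtb or pysisyphus. The regular expression and the helper name `first_times` are assumptions, and the input strings are the "total" timing lines copied from the two outputs above.

```python
import re

# Matches xtb timing lines such as
# " * wall-time:     0 d,  0 h,  0 min,  0.028 sec"
TIME_RE = re.compile(
    r"\*\s*(wall|cpu)-time:\s*(\d+)\s*d,\s*(\d+)\s*h,\s*(\d+)\s*min,\s*([\d.]+)\s*sec"
)


def first_times(log_text):
    """Return the first wall and cpu time (in seconds) found in an xtb log."""
    times = {}
    for kind, d, h, m, s in TIME_RE.findall(log_text):
        seconds = ((int(d) * 24 + int(h)) * 60 + int(m)) * 60 + float(s)
        times.setdefault(kind, seconds)  # keep only the first ("total") entry
    return times


one = first_times(
    "* wall-time: 0 d, 0 h, 0 min, 0.028 sec\n"
    "* cpu-time: 0 d, 0 h, 0 min, 0.019 sec"
)
eight = first_times(
    "* wall-time: 0 d, 0 h, 0 min, 0.740 sec\n"
    "* cpu-time: 0 d, 0 h, 0 min, 5.093 sec"
)
slowdown = eight["wall"] / one["wall"]
print(f"8 threads take {slowdown:.1f}x the wall time of 1 thread")
```

For the two runs shown, the 8-thread calculation needs roughly 26 times the wall time of the single-thread one, while energy and gradient norm agree to at least nine decimal places.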
I am closing this myself. The underlying issue is caused by wrongly compiled versions of xtb on our HPC cluster. Using the precompiled xtb versions provided by the Grimme group shows the expected behavior.
Describe the bug
Increasing pal leads to unexpectedly slower calculation cycles.

To Reproduce
Running a NEB calculation with pal: 1 gives shorter s/cycle times than running the same calculation with the pal: 8 setting. In both cases the environment variables OMP_NUM_THREADS and MKL_NUM_THREADS are both set to 1 and 8, respectively.

I also just tried it with your Diels-Alder example: when setting the ENV vars both to 8, and also pal: 8, the calculation is very slow. Changing pal to 1, while not touching the ENV variables, speeds up the calculation nearly by an order of magnitude.

Expected behavior
The calculation cycles should get faster when increasing either pal or the environment variables.

OS and Python:

Pysisyphus version
Current dev branch, installation from source.
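To check whether the slowdown comes from xtb itself rather than from pysisyphus, the same single-point calculation can be timed directly with different thread counts. The following is a hedged sketch: the helper names `thread_env` and `time_xtb` are made up for illustration, the command line is the one quoted in this thread, and the script assumes that xtb is on the PATH and that diels_alder_educt.xyz is in the working directory.

```python
import os
import shutil
import subprocess
import time

# Command taken from this thread; the input file name is the one used there.
XTB_CMD = ["xtb", "diels_alder_educt.xyz", "--chrg", "0", "--uhf", "0",
           "--acc", "1.0", "--gfn", "2", "--grad"]


def thread_env(n_threads, base_env=None):
    """Copy of the environment with both thread variables set to n_threads."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(n_threads)
    env["MKL_NUM_THREADS"] = str(n_threads)
    return env


def time_xtb(n_threads):
    """Wall time in seconds of one xtb run with the given thread count."""
    start = time.perf_counter()
    subprocess.run(XTB_CMD, env=thread_env(n_threads), check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Only run the timing when xtb and the input file are actually available.
    if shutil.which("xtb") and os.path.exists(XTB_CMD[1]):
        for n in (1, 8):
            print(f"{n} thread(s): {time_xtb(n):.3f} s")
    else:
        print("xtb or the input file was not found; nothing to time.")
```

Note that xtb writes its output files into the current directory, so it is best to run this in a scratch folder. If the 8-thread run is much slower here as well, the xtb build is the likely culprit and pal is not.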