grimme-lab / enso

energetic sorting of conformer rotamer ensembles
https://xtb-docs.readthedocs.io/en/latest/enso_doc/enso.html
GNU Lesser General Public License v3.0

xtb as driver for parallel ORCA #13

Closed msh-yi closed 4 years ago

msh-yi commented 4 years ago

Hello enso team,

I am trying to use enso and ORCA to refine a crest ensemble at the DFT level (part 1, 2, 3). Unfortunately the procedure fails in part 1, where for each calculation, I get an ORCA error:

--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 16 slots
that were requested by the application:
  /gpfs/loomis/apps/avx/software/ORCA/4.2.1-OpenMPI-2.1.2/orca_gtoint_mpi

Either request fewer slots for your application, or make more slots available
for use.
--------------------------------------------------------------------------

ORCA finished by error termination in GTOInt
Calling Command: mpirun -np 16  /gpfs/loomis/apps/avx/software/ORCA/4.2.1-OpenMPI-2.1.2/orca_gtoint_mpi inp.int.tmp inp 
[file orca_tools/qcmsg.cpp, line 458]: 
  .... aborting the run

[file orca_tools/qcmsg.cpp, line 458]: 
  .... aborting the run

I was able to get the same error simply by running one instance of xtb as a driver for ORCA (`xtb coord --opt crude --orca`), even though I allocated the correct number of processors to xtb. Since I encountered this error while setting up ORCA to run in parallel, I suspect it is related to how xtb calls ORCA and passes MPI paths to it, or to how I am specifying paths in the submit script.

I've attached the enso output file (enso.out) and the Slurm submit script (slurm_enso.sh) from the enso run. Also attached are the xtb coordinate input file (coord) and the ORCA input (inp) for a test case using xtb as an ORCA driver.

Thank you!

Marcus

slurm_enso.sh.txt enso.out.txt inp.txt coord.txt

fabothch commented 4 years ago

Hi @msh-yi ,

from looking at your job script, you are trying to run ENSO and xTB across several nodes (4 tasks with 16 cores each):

#SBATCH --ntasks=4
#SBATCH --cpus-per-task=16

ENSO and xTB are not designed to work this way! You can use ENSO and xTB only on one node!

First, try running an ORCA calculation using xTB as a driver on only one node and see if this resolves your problem.

Also, in your job script, ENSO will distribute the correct number of cores automatically (you don't have to assign 16 cores to xTB yourself). This assumes, of course, that you set maxthreads and omp correctly in your file flags.dat.

You set:

maxthreads:                                                    4
omp:                                                           16

which means that you run four independent threads with 16 cores assigned to each! You are then effectively requesting 4 * 16 = 64 cores!
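As a quick sanity check (an editorial sketch, not part of ENSO), the total core demand follows directly from the two flags:

```shell
#!/bin/sh
# Total cores an ENSO run will try to use:
# maxthreads parallel ENSO threads, each running one calculation on omp cores.
maxthreads=4   # value from the flags.dat excerpt above
omp=16         # value from the flags.dat excerpt above
echo $((maxthreads * omp))   # prints 64
```

That product must fit inside what the scheduler actually granted on a single node.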

Best,

fabothch

msh-yi commented 4 years ago

Hi fabothch,

Thanks for looking into this!

I have tried to run an independent instance of xTB as an ORCA driver on only one node, and I get the same error. The files are attached:

coord.runxtb.slurm.bash.txt: job script (one node, one task, 8 cores)
coord.txt, inp.txt: coordinate and ORCA inputs
coord.runxtb.out.txt: output showing the same error

I performed two ENSO runs with the following settings, both with the same errors as before:

1. One node, two tasks:

   SLURM settings:
   #SBATCH --nodes=1
   #SBATCH --ntasks=2
   #SBATCH --cpus-per-task=8

   xTB settings:
   export OMP_NUM_THREADS=8
   export MKL_NUM_THREADS=8

   ENSO flags:
   maxthreads = 2
   omp = 8

From my understanding, `maxthreads` is the number of simultaneous calculations (e.g. Part 1 optimizations) and is therefore equivalent to Slurm's `ntasks`; `omp` is the number of cores for each calculation, and is therefore equivalent to Slurm's `cpus-per-task`, as well as xTB's `$OMP_NUM_THREADS = $MKL_NUM_THREADS`. 
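If that mapping holds, the xTB thread count can be derived from Slurm's own environment instead of being hard-coded in the job script. A hypothetical sketch (the fallback value and variable handling are my assumptions, not ENSO behaviour):

```shell
#!/bin/bash
# Hypothetical sketch: derive the per-calculation thread count from
# SLURM_CPUS_PER_TASK, which Slurm sets when --cpus-per-task is requested.
# Outside a Slurm job, fall back to 8 (the value used in the runs above).
omp=${SLURM_CPUS_PER_TASK:-8}
export OMP_NUM_THREADS=$omp
export MKL_NUM_THREADS=$omp
echo "omp=$omp"
```

This keeps the Slurm request and the xTB threading from drifting apart when one of them is edited.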

Files:
[opt-part1.out.txt](https://github.com/grimme-lab/enso/files/4818417/opt-part1.out.txt) (sample orca output from first conformer)
[slurm_enso.sh.txt](https://github.com/grimme-lab/enso/files/4818418/slurm_enso.sh.txt)
[enso.out.txt](https://github.com/grimme-lab/enso/files/4818419/enso.out.txt)
[flags.dat.txt](https://github.com/grimme-lab/enso/files/4818420/flags.dat.txt)

2. One node, one task, two ENSO threads (in case I have misunderstood `maxthreads` and `omp`):

   SLURM settings:
   #SBATCH --nodes=1
   #SBATCH --ntasks=1
   #SBATCH --cpus-per-task=16

   xTB settings:
   export OMP_NUM_THREADS=8
   export MKL_NUM_THREADS=8

   ENSO flags:
   maxthreads = 2
   omp = 8



Files:
[slurm_enso.sh.txt](https://github.com/grimme-lab/enso/files/4818526/slurm_enso.sh.txt)
[enso.out.txt](https://github.com/grimme-lab/enso/files/4818527/enso.out.txt)
[flags.dat.txt](https://github.com/grimme-lab/enso/files/4818528/flags.dat.txt)

As a side note, I was not aware that ENSO/xTB should not be used across nodes - I was indeed trying to run four 16-core threads, one thread on each node.

Thank you again :)

Marcus

fabothch commented 4 years ago

Hi Marcus,

I have never used Slurm, since we have a different cluster setup, but because ENSO spawns its own subprocesses I would have expected your second approach to work.

Does a normal ORCA calculation run on your system on only one node?

best,

fabothch

msh-yi commented 4 years ago

Hi fabothch,

It turned out to be a Slurm issue. I was able to resolve the problem by requesting 16 cores with Slurm's `--ntasks=16`, not `--cpus-per-task`. I.e. for Slurm users:

maxthreads * omp = ntasks = maximum number of cores requested over all threads
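Putting that together, a single-node submit script for this setup might look like the sketch below (the invocation line is a placeholder; adapt paths and module loads to your own cluster):

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=16        # = maxthreads * omp = 2 * 8

# corresponding flags.dat entries:
#   maxthreads: 2          # simultaneous conformer calculations
#   omp:        8          # cores per calculation

export OMP_NUM_THREADS=8
export MKL_NUM_THREADS=8

# placeholder invocation; adjust to your enso installation
enso.py -run > enso.out 2>&1
```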

Thank you for your support!

Marcus