Closed luisgithub269 closed 5 years ago
Hi, everyone
I need your help, please.
The main reason I am writing this message is that I need to run a simulation of 500 files, each taking 5 hours. The assigned resources are 4 nodes with 24 cores each, which lets me run 8 tasks simultaneously (two jobs per node, each job with ntasks=12), reducing the execution time to 6 days.
The second reason is that next week I have to submit my thesis in order to graduate in chemical engineering at the Universidad Industrial de Santander, Colombia.
Please excuse my writing; I know little English.
The programs used are:
orca_4_0_1_2_linux_x86-64_openmpi202.tar.xz
openmpi-2.0.2.tar.gz
LAUNCHER-TACC (GitHub version)
I have the following problem with Launcher.
The main problem is how to assign the resources that the launcher script (SCRIPTLAUNCHER.SH) passes to the file (EJECUTAR1.SH) that runs each ORCA input file. The SLURM resource allocator expects to find the following environment variables:
SLURM_NODELIST
SLURM_TASKS_PER_NODE
However, it was unable to find the following environment variable:
SLURM_NODELIST
When I run the program, it shows the following message.
The output of the ORCA (quantum chemistry) simulation:
While trying to determine what resources are available, the SLURM resource allocator expects to find the following environment variables:
SLURM_NODELIST
SLURM_TASKS_PER_NODE
However, it was unable to find the following environment variable:
SLURM_NODELIST
[file orca_main/gtoint.cpp, line 137]: ORCA finished by error termination in OR$
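To see which of the two variables named in this message actually reach the task script, a check along these lines could be run at the top of ejecutar1.sh before ORCA is called. This is only a sketch: `check_slurm_env` is a hypothetical helper name and is not part of Launcher, ORCA, or Open MPI.

```shell
# check_slurm_env: report whether the Slurm variables named in the
# error message are set in the current environment.
# Hypothetical helper for illustration; returns 1 if any is missing.
check_slurm_env() {
    missing=0
    for var in SLURM_NODELIST SLURM_TASKS_PER_NODE; do
        # Look up the variable whose name is stored in $var.
        eval "val=\${$var:-}"
        if [ -z "$val" ]; then
            echo "missing: $var"
            missing=1
        else
            echo "found: $var=$val"
        fi
    done
    return "$missing"
}
```

Calling `check_slurm_env || exit 1` before the `$ORCA_PATH/orca` line would stop the task early with a readable message instead of the ORCA error termination.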
The contents of ejecutar1.sh:
export PATH=/usr/local/openmpi-2.0.2/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/openmpi-2.0.2/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/usr/local/openmpi-2.0.2/openmpi:$LD_LIBRARY_PATH
export ORCA_PATH=/home/luis_sc3/plantatrabajo/orca4012
export PATH=$ORCA_PATH:$PATH
export FILE_PATH=/scratch/luis_sc3orca/ligandos1/ligandos/l1
$ORCA_PATH/orca $FILE_PATH/l1.inp > $FILE_PATH/l1.out
The job file (ejecucion):
./ejecutar1.sh
./ejecutar2.sh
./ejecutar3.sh
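Since the full run involves 500 inputs, the job file could be generated rather than typed by hand. The sketch below assumes the task scripts keep the ejecutarN.sh naming pattern for every N, which the post only shows for 1 to 3; `make_jobfile` is a hypothetical helper name.

```shell
# make_jobfile N FILE: write one "./ejecutarK.sh" line per job, K = 1..N.
# Assumes (not confirmed by the post) that every task script follows
# the ejecutarK.sh naming pattern.
make_jobfile() {
    n=$1
    jobfile=$2
    : > "$jobfile"              # create or truncate the job file
    i=1
    while [ "$i" -le "$n" ]; do
        echo "./ejecutar$i.sh" >> "$jobfile"
        i=$((i + 1))
    done
}
```

For example, `make_jobfile 500 ejecucion` would write a 500-line job file in the current directory.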
The ORCA (quantum chemistry) simulation input:
! Opt B3LYP def2-SV(P) KDIIS SOSCF
%pal nprocs 12 end
%MaxCore 8000
* xyz 0 1
C -0.009689 0.061838 0.577205
C 1.385471 0.061838 0.577205
C 2.083009 1.269589 0.577205
C 1.385355 2.478098 0.576006
C -0.00947 2.47802 0.575527
C -0.707071 1.269814 0.576523
H -0.559448 -0.890479 0.577655
H 1.934979 -0.890675 0.57852
H 1.935555 3.430241 0.575947
H -0.559592 3.430301 0.574582
C 3.623008 1.269701 0.578093
H 4.156068 2.195284 0.641675
C 4.305383 0.101533 0.498621
H 3.783961 -0.83143 0.549614
C 5.83652 0.116891 0.334358
H 6.19942 -0.888179 0.279245
O 6.27975 0.804837 -0.882982
C -2.247071 1.27007 0.576271
H -2.780381 0.343817 0.525923
C -2.929131 2.439409 0.639598
H -2.419926 3.368733 0.491385
C -4.442266 2.43064 0.92586
O -5.081905 3.512678 0.986065
O -5.122792 1.188646 1.123939
H -6.06831 1.346171 1.176669
O 6.429622 0.777015 1.455682
Zn 7.777071 0.705579 -0.731688
Zn 7.870765 0.852637 0.758143
*
The launcher output:
Launcher: Setup complete.
------------- SUMMARY ---------------
Number of hosts: 2
Working directory: /scratch/luis_sc3orca/ligandos1/ligandos
Processes per host: 2
Total processes: 4
Total jobs: 3
Scheduling method: dynamic
Launcher: Starting parallel tasks...
Launcher: Task 0 running job 1 on guane14 (./ejecutar1.sh)
Launcher: Task 1 running job 2 on guane14 (#./ejecutar2.sh)
Launcher: Job 2 completed in 0 seconds.
Launcher: Task 1 running job 3 on guane14 (#./ejecutar3.sh)
Launcher: Job 3 completed in 0 seconds.
Launcher: Task 1 done. Exiting.
[guane14:21063] [[16989,0],0] ORTE_ERROR_LOG: Not found in file base/ras_base_a$
[file orca_main/gtoint.cpp, line 137]: ORCA finished by error termination in OR$
Launcher: Job 1 completed in 0 seconds.
Launcher: Task 0 done. Exiting.
localhost [127.0.0.1] 9471 (?) : Connection refused
localhost [127.0.0.1] 9471 (?) : Connection refused
WARNING: No response from dynamic task server. Retrying...
localhost [127.0.0.1] 9471 (?) : Connection refused
localhost [127.0.0.1] 9471 (?) : Connection refused
WARNING: No response from dynamic task server. Retrying...
...
The launcher input script:
export LAUNCHER_DIR=/home/luis_sc3/plantatrabajo/launcher #pwd direcccion de la$
export PATH=$LAUNCHER_DIR:$PATH
export PATH=/usr/bin/python2.7:$PATH
export PATH=/usr/lib/python2.7:$PATH
export LAUNCHER_PPN=2 # number of jobs per node
export LAUNCHER_PLUGIN_DIR=$LAUNCHER_DIR/plugins # job manager
export LAUNCHER_RMI=SLURM
export LAUNCHER_WORKDIR=$PWD # pwd_of_jobfile
export LAUNCHER_JOB_FILE=ejecucion # jobfile
export LAUNCHER_SCHED=dynamic
$LAUNCHER_DIR/paramrun
I'm going to close this one since it's a duplicate of #51. @luisgithub269 if I'm mistaken please feel free to let me know and I'll re-open it.
Hi, everyone
I have the following problem.
----------------------------
While trying to determine what resources are available, the SLURM resource allocator expects to find the following environment variables:
SLURM_NODELIST
SLURM_TASKS_PER_NODE
However, it was unable to find the following environment variable:
SLURM_NODELIST
[file orca_main/gtoint.cpp, line 137]: ORCA finished by error termination in ORCA_GTOInt
Launcher: Setup complete.
------------- SUMMARY ---------------
Number of hosts: 2
Working directory: /scratch/luis_sc3orca/ligandos1/ligandos
Processes per host: 2
Total processes: 4
Total jobs: 3
Scheduling method: dynamic
Launcher: Starting parallel tasks...
Launcher: Task 0 running job 1 on guane14 (srun -N 1 -n 12 ./ejecutar1.sh)
Launcher: Task 1 running job 2 on guane14 (# ./ejecutar2.sh)
Launcher: Job 2 completed in 0 seconds.
Launcher: Task 1 running job 3 on guane14 (# ./ejecutar3.sh)
srun: error: Unable to create step for job 114446: More processors requested than permitted
Launcher: Job 3 completed in 0 seconds.
Launcher: Job 1 completed in 0 seconds.
Launcher: Task 1 done. Exiting.
Launcher: Task 0 done. Exiting.
localhost [127.0.0.1] 9471 (?) : Connection refused
/home/luis_sc3/plantatrabajo/launcher/launcher: line 82: [: -gt: unary operator expected
localhost [127.0.0.1] 9471 (?) : Connection refused
/home/luis_sc3/plantatrabajo/launcher/launcher: line 82: [: -gt: unary operator expected
localhost [127.0.0.1] 9471 (?)localhost [127.0.0.1] 9471 (?) : Connection refused : Connection refused
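The srun line in this log asks for 12 tasks, while the master script requests only 4 tasks for the whole job (-n 4), which may be what triggers "More processors requested than permitted". A small guard, sketched under that assumption, could compare the two numbers before calling srun. `step_fits` is a hypothetical helper name, and comparing task counts is a simplification of what Slurm actually checks (CPUs available to the allocation).

```shell
# step_fits REQUESTED: succeed only if REQUESTED tasks fit in the allocation.
# SLURM_NTASKS is set by Slurm inside a batch job when a task count was
# requested; if it is unset, the allocation is treated as 0 tasks.
# Hypothetical helper for illustration only.
step_fits() {
    requested=$1
    allocated=${SLURM_NTASKS:-0}
    if [ "$requested" -le "$allocated" ]; then
        echo "ok: $requested <= $allocated"
    else
        echo "too many: $requested > $allocated"
        return 1
    fi
}
```

With the values from this post, `step_fits 12` inside a job submitted with -n 4 would report that the step is too large.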
The master script:
#!/bin/bash
#--------SCHEDULER OPTIONS-------
#SBATCH -J parametric
#SBATCH -N 2
#SBATCH -n 4
#SBATCH -p manycores24
#SBATCH -w=guane[09-11]
#SBATCH -o parametric.o%j
#--------GENERAL OPTIONS---------
export LAUNCHER_DIR=/home/luis_sc3/plantatrabajo/launcher # pwd: launcher directory path
export PATH=$LAUNCHER_DIR:$PATH
export PATH=/usr/bin/python2.7:$PATH
export PATH=/usr/lib/python2.7:$PATH
export LAUNCHER_PLUGIN_DIR=$LAUNCHER_DIR/plugins # job manager
export LAUNCHER_PPN=2 # number of jobs per node
export LAUNCHER_PLUGIN_DIR=$LAUNCHER_DIR/plugins # job manager
export LAUNCHER_RMI=SLURM
export EXECUTABLE=$LAUNCHER_DIR/init_laucher
export LAUNCHER_WORKDIR=$PWD # pwd_of_jobfile
export LAUNCHER_JOB_FILE=ejecucion # jobfile
#--------TASKS SCHEDULING OPTIONS
export LAUNCHER_SCHED=dynamic
#--------EXECUTING---------------
$LAUNCHER_DIR/paramrun
The job file:
./ejecutar1.sh
./ejecutar2.sh
./ejecutar3.sh
The script ejecutar1.sh:
#!/bin/bash
# Node group (partition) to use
#SBATCH --partition=manycores24
# Job name
#SBATCH -J l1
# Output file name
#SBATCH -o l1.%j.out
# Number of nodes
#SBATCH --nodes=1
# Number of tasks
#SBATCH --ntasks=12
# Number of tasks per node
#SBATCH --tasks-per-node=12
# Number of CPUs per task
#SBATCH --cpus-per-task=1
# Time limit for the run
#SBATCH --time=120:00:00
# Memory assigned to each CPU
#SBATCH --mem-per-cpu=8G
# Request unlimited locked memory
ulimit -l unlimited
# openmpi-2.0.2 variables
export PATH=/usr/local/openmpi-2.0.2/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/openmpi-2.0.2/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/usr/local/openmpi-2.0.2/openmpi:$LD_LIBRARY_PATH
# Variable used to call ORCA
export ORCA_PATH=/home/luis_sc3/plantatrabajo/orca4012
export PATH=$ORCA_PATH:$PATH
# Variable pointing to the input file directory
export FILE_PATH=/scratch/luis_sc3orca/ligandos1/ligandos/l1
$ORCA_PATH/orca $FILE_PATH/l1.inp > $FILE_PATH/l1.out
Any ideas how to fix the problem?