qcxms / QCxMS

Quantum mechanic mass spectrometry calculation program
https://xtb-docs.readthedocs.io/en/latest/qcxms_doc/qcxms.html
GNU Lesser General Public License v3.0

Initial setup scripts #1

Open tobigithub opened 3 years ago

tobigithub commented 3 years ago

Hi, congratulations on the initial release. It is very exciting to have everything automated. The initial release came out just 3 days ago, so I guess this is all very fresh. For users who are not familiar with the program, it would be nice to have two additional scripts in the release.

1) It would be nice to have an .envrc file and a source script to export the appropriate variables.
2) It would be nice to set up the appropriate OMP variables for parallel execution.

For example, without any restrictions the program uses all hardware threads instead of just the physical CPU cores. In my case this leads to complete over-saturation (I believe). So OMP_NUM_THREADS could also be set in the setup script, but to the number of cores, not threads.
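A minimal sketch of how a setup script could derive the physical core count rather than the hardware thread count. It assumes a Linux machine with `lscpu` from util-linux; the helper name `count_physical_cores` is made up for illustration and is not part of QCxMS.

```shell
# Hypothetical helper: count physical cores from `lscpu -p=CPU,CORE,SOCKET` output.
# Each input line is "cpu,core,socket"; header lines start with '#'.
# Hyperthread siblings share the same (core, socket) pair, so unique pairs = physical cores.
count_physical_cores() {
  grep -v '^#' | awk -F, '{ print $2 "," $3 }' | sort -u | wc -l | tr -d ' '
}

# Only query the machine if lscpu is available (assumption: Linux with util-linux).
if command -v lscpu >/dev/null 2>&1; then
  cores=$(lscpu -p=CPU,CORE,SOCKET | count_physical_cores)
  echo "physical cores: $cores"
fi
```

The resulting number could then be used for `export OMP_NUM_THREADS=...` in the setup script instead of a hard-coded value.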

.envrc (just an example from my environment, not universal)

export XTBHOME=/home/ubuntu/QCxMS.v5.0/.XTBPARAM
export PATH=$PATH:/home/ubuntu/QCxMS.v5.0
export OMP_NUM_THREADS=112
ulimit -s unlimited

and the command to source the .envrc (from my environment, not universal)

source /home/ubuntu/QCxMS.v5.0/.envrc

Also, when OMP_NUM_THREADS is set to the full count as above on a single NUMA machine (not multiple cluster nodes), the pqcxms argument has to be set to one (pqcxms 1); otherwise the threads are oversubscribed.
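To avoid oversubscription, the number of parallel pqcxms jobs times OMP_NUM_THREADS should not exceed the physical core count. A small sketch of that arithmetic follows; the function name `jobs_for_cores` is hypothetical, and it assumes the argument to pqcxms is the number of parallel jobs, as used in this thread.

```shell
# Hypothetical helper: largest job count such that jobs * threads <= cores.
# Usage: jobs_for_cores <physical_cores> <omp_threads_per_job>
jobs_for_cores() {
  cores=$1
  threads=$2
  jobs=$(( cores / threads ))
  # Always run at least one job, even if threads exceeds cores.
  if [ "$jobs" -lt 1 ]; then
    jobs=1
  fi
  echo "$jobs"
}

# Example: all 112 threads given to OpenMP leaves room for a single job.
echo "pqcxms $(jobs_for_cores 112 112)"
```

With OMP_NUM_THREADS equal to the full count this gives `pqcxms 1`, which matches the case described above.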

Best Tobias


tobigithub commented 3 years ago

Actually, for small molecules or similar cases it is better to just run export OMP_NUM_THREADS=2. I ran the test on another machine with 48 physical cores (96 threads). Keeping the number of pqcxms jobs times OMP_NUM_THREADS exactly at the physical core count of 48, export OMP_NUM_THREADS=2 was the fastest. Again, this is just a micro-benchmark, and results may vary on other platforms. For PBS, SLURM, or TORQUE it is probably similar: an OMP_NUM_THREADS of 2-4 will likely be best.

export OMP_NUM_THREADS=1
time pqcxms 48
real    4m15.585s
user    44m26.680s
sys     0m42.070s

export OMP_NUM_THREADS=2
time pqcxms 24
real    2m57.483s
user    63m39.573s
sys     0m46.494s

export OMP_NUM_THREADS=4
time pqcxms 12
real    3m55.269s
user    253m8.834s
sys     5m6.752s

export OMP_NUM_THREADS=8
time pqcxms 6
real    8m27.914s
user    358m16.833s
sys     2m13.541s
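For easier comparison, the timings above can be converted to seconds. The sketch below is only a convenience for reading the numbers; the helper `to_seconds` is hypothetical, and the user/real ratio is just a rough indicator of how many cores were busy on average.

```shell
# Hypothetical helper: convert a `time`-style value such as "4m15.585s" to seconds.
to_seconds() {
  echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Columns: OMP_NUM_THREADS, pqcxms jobs, real, user (copied from the runs above).
while read -r threads procs real user; do
  r=$(to_seconds "$real")
  u=$(to_seconds "$user")
  # CPU time divided by wall time approximates the average number of busy cores.
  busy=$(awk -v u="$u" -v r="$r" 'BEGIN { printf "%.1f", u / r }')
  echo "threads=$threads jobs=$procs real=${r}s busy_cores=$busy"
done <<'EOF'
1 48 4m15.585s 44m26.680s
2 24 2m57.483s 63m39.573s
4 12 3m55.269s 253m8.834s
8 6 8m27.914s 358m16.833s
EOF
```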