ReactionMechanismGenerator / ARC

ARC - Automatic Rate Calculator
https://reactionmechanismgenerator.github.io/ARC/index.html
MIT License

Fix the molpro input file memory #651

Closed alongd closed 1 year ago

alongd commented 1 year ago

Molpro's input file memory is specified per CPU core; this PR restores that consideration in the calculation. The conversion from MW (megawords) to GB was done using this website, specifying a 64-bit architecture.
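For reference, here is a minimal sketch of that conversion (the helper below is hypothetical, not ARC's actual function): on a 64-bit architecture a word is 8 bytes, so 1 MW = 8 MB.

def total_gb_to_mw_per_core(total_mem_gb: float, cpu_cores: int) -> int:
    """Convert a job's total memory (GB) to Molpro's per-core MW value.

    Hypothetical helper illustrating the conversion; assumes a 64-bit word.
    """
    bytes_per_word = 8  # 64-bit architecture
    total_mw = total_mem_gb * 1e9 / bytes_per_word / 1e6
    return int(total_mw / cpu_cores)

# Example: a 64 GB job on 16 cores -> 'memory,500,m' in the input file.
print(total_gb_to_mw_per_core(64, 16))  # 500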

A test was added.

I snuck two minor TS-related commits into this small PR:

  1. Added a shortcut to specify xyz when initializing an ARCReaction object, in addition to ts_xyz_guess, which is less intuitive (see the sketch after this list)
  2. Initialize user TS guesses with the success flag set. This flag relates to the method used to generate the guess and whether it produced any guesses, which is always true for a user guess; it has nothing to do with whether the guess is actually reasonable
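A hedged usage sketch of the xyz shortcut from item 1 (the reaction label and geometry are illustrative, and the full ARCReaction signature may differ):

from arc.reaction import ARCReaction

# An illustrative user-supplied TS geometry (xyz-format string):
ts_guess = """C    0.0000    0.0000    0.0000
H    0.0000    0.0000    1.0900"""

# Previously the guess went through the less intuitive argument:
# rxn = ARCReaction(label='CH4 + OH <=> CH3 + H2O', ts_xyz_guess=[ts_guess])
# With the shortcut, xyz can be passed directly at initialization:
rxn = ARCReaction(label='CH4 + OH <=> CH3 + H2O', xyz=[ts_guess])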
codecov[bot] commented 1 year ago

Codecov Report

Merging #651 (7369084) into main (9120fee) will increase coverage by 0.01%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #651      +/-   ##
==========================================
+ Coverage   73.15%   73.17%   +0.01%     
==========================================
  Files          99       99              
  Lines       26402    26414      +12     
  Branches     5534     5534              
==========================================
+ Hits        19315    19329      +14     
+ Misses       5719     5718       -1     
+ Partials     1368     1367       -1     
Flag        Coverage Δ
unittests   73.17% <100.00%> (+0.01%) ↑

Flags with carried forward coverage won't be shown.

Impacted Files                     Coverage Δ
arc/job/adapters/molpro.py         66.31% <100.00%> (+0.35%) ↑
arc/job/adapters/molpro_test.py    100.00% <100.00%> (ø)

... and 1 file with indirect coverage changes


calvinp0 commented 1 year ago

@alongd

So I tried an input file that requires a lot of MWords. You can see I gave it 4000 MWords to begin with but had no -n set on the command line. It eventually converged after a couple of hours, no problem:

***,DMF_rad_on_methyl
memory,4000,m;
file,1,file1.int    !allocate permanent integral file
file,2,file2.wfu    !allocate permanent wave-function (dump) file

geometry={angstrom;
C      -2.10714600    1.24844900    0.32354700
C      -1.03500900    0.22991700    0.36805500
C      -0.97884000   -1.03439400   -0.43828800
C       0.50608200   -1.26344000   -0.50929300
C       1.13266500   -0.22234600    0.04183100
C       2.56724200    0.09360300    0.23628500
O       0.25451400    0.73290900    0.52090200
H      -3.08293100    0.77492100    0.43101600
H      -1.98761200    1.97675800    1.12579000
H      -2.10875500    1.80017800   -0.62816500
H      -1.52417200   -1.85838400    0.03478600
H      -1.42868400   -0.90773400   -1.43760400
H       0.99137100   -2.11859500   -0.94935400
H       2.82187600    1.03630100   -0.25070100
H       2.79604700    0.19906200    1.29798700
H       3.18729200   -0.69683600   -0.18108200}

basis=cc-pvtz-f12

int;
{hf;
maxit,1000;
wf,spin=1,charge=0;}

uccsd(t)-f12;

---;

Now, I don't know if this was the correct procedure, but I divided the 4000 by 16 and so set the MWords to 250:

***,DMF_rad_on_methyl
memory,250,m;
file,1,file1.int    !allocate permanent integral file
file,2,file2.wfu    !allocate permanent wave-function (dump) file

geometry={angstrom;
C      -2.10714600    1.24844900    0.32354700
C      -1.03500900    0.22991700    0.36805500
C      -0.97884000   -1.03439400   -0.43828800
C       0.50608200   -1.26344000   -0.50929300
C       1.13266500   -0.22234600    0.04183100
C       2.56724200    0.09360300    0.23628500
O       0.25451400    0.73290900    0.52090200
H      -3.08293100    0.77492100    0.43101600
H      -1.98761200    1.97675800    1.12579000
H      -2.10875500    1.80017800   -0.62816500
H      -1.52417200   -1.85838400    0.03478600
H      -1.42868400   -0.90773400   -1.43760400
H       0.99137100   -2.11859500   -0.94935400
H       2.82187600    1.03630100   -0.25070100
H       2.79604700    0.19906200    1.29798700
H       3.18729200   -0.69683600   -0.18108200}

basis=cc-pvtz-f12

int;
{hf;
maxit,1000;
wf,spin=1,charge=0;}

uccsd(t)-f12;

---;

On the command line I used molpro -n 16 -m 36045000000 -d $MOLPRO_SCRDIR input.in

So I am concerned that this will be an issue again, where we start off with a small MWords value and have to increase it multiple times until we provide the run with enough memory.
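For illustration, the restart pattern this concern describes might look like the following sketch (hypothetical helper and attribute names, not ARC's actual troubleshooting code):

def run_with_memory_retries(run_job, mw: int, factor: float = 2.0, max_tries: int = 4):
    """Resubmit a Molpro job with a larger MW value until it stops failing on memory."""
    for _ in range(max_tries):
        result = run_job(mw)                # submits a job with `memory,{mw},m`
        if not result.insufficient_memory:  # assumed flag parsed from the log
            return result
        mw = int(mw * factor)               # e.g. 250 -> 500 -> 1000 -> 2000
    raise RuntimeError('Molpro job still short on memory after retries')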

calvinp0 commented 1 year ago

Until I get real confirmation, I have a theory on the Molpro issue, and @kfir4444 came to the same conclusion.

With Atlas, when we specify -n {cpu_cores} in the submit file, Molpro actually recognises it and takes it into consideration; thus the total MW we define is multiplied by the number of cores.

However, Zeus appears to behave differently when submitting through PBS. Even if we define -n 16 in the submit file's command line for Zeus, it will still only run with 1 process:

Nodes     nprocs
 n027         1

Thus, 4000 MW on Zeus is 4000 MW total, whereas on Atlas it would be 64000 MW total due to it being 4000 * 16.

HOWEVER, if I am in a Zeus terminal and run molpro -n 16 input.in directly, it will show this:

Nodes     nprocs
 zeus         16

So until I can figure out how to define the number of CPUs per node for Zeus, we will have to update the PR to check which server we are using: if Zeus, don't divide by CPU cores, and if Atlas, divide by CPU cores.
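A hedged sketch of that server-dependent check (server names taken from this thread; the helper and its signature are assumptions, not the PR's actual code):

def molpro_input_mw(total_mem_gb: float, cpu_cores: int, server: str) -> int:
    """Return the MW value to write into Molpro's `memory` directive."""
    total_mw = int(total_mem_gb * 1e9 / 8 / 1e6)  # 8 bytes per 64-bit word
    if server == 'zeus':
        # PBS on Zeus currently runs Molpro with a single process,
        # so use the total memory without dividing by the core count.
        return total_mw
    # On Atlas, Molpro multiplies the per-core MW by -n, so divide.
    return total_mw // cpu_cores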

calvinp0 commented 1 year ago

I have submitted a ticket to the Zeus support team about this issue we are experiencing.

alongd commented 1 year ago

@calvinp0, shall I review, or are we waiting on the Zeus support team?

calvinp0 commented 1 year ago

> @calvinp0, shall I review, or are we waiting on the Zeus support team?

I'd say review. The Zeus team have yet to respond to our other issues regarding job submission, so it may take some time for them to get back to us.

alongd commented 1 year ago

Looks good overall. We have commits to squash (a fixup, plus the last two); please squash and rebase.