Run gstack or equivalent to get a backtrace. It is probably just a really expensive calculation taking a long time. Please also provide the input file so I can reproduce.
Thank you for your answer. The input file is https://www.pastefs.com/pid/159041
I tried strace (is it equivalent to gstack?). I do not understand its continuous output. I ran it and stopped it with Ctrl+C. The output is https://www.pastefs.com/pid/159042.
IO EAF puts everything on disk. This is super slow. Any chance you can fit this calculation into memory?
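For context, the IO scheme is a one-line setting in the TCE block. A minimal sketch of the in-memory alternative, assuming the job fits in aggregate memory (`io ga` and `io eaf` are the standard TCE io options; the method keyword is illustrative):

```
tce
 io ga    # keep intermediates in Global Arrays (RAM) instead of
          # io eaf, which spills them to per-process disk files
 ccsdt
end
```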
Try this...

```
memory stack 22000 mb heap 100 mb global 22000 mb noverify

geometry units angstrom
 symmetry C2V
 CL 0.0000000000  0.0000000000 -0.3666912916
 F  0.0000000000 -1.7010030671 -0.2815117930
 F  0.0000000000  1.7010030671 -0.2815117930
 F  0.0000000000  0.0000000000  1.2379631400
end

basis small
```
Thank you. I bet I cannot fit everything in memory. I tried before, but I ran out of memory, although I did not specify stack, heap, and global at the time. Can that make a difference? (I am performing the calculation on my home PC.)
GA and stack+heap are separate. The latter is a static up-front allocation. GA is dynamic, à la malloc, so you can set it to a big number and it will only segfault if it really needs all of it.
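Concretely, in the memory directive only the global (GA) part behaves this way; a minimal sketch (the values here are illustrative, not a recommendation):

```
# stack and heap are static per-process allocations made up front;
# global (GA) is dynamic, so a generous value is only touched if
# the job actually allocates that much
memory stack 1000 mb heap 100 mb global 22000 mb noverify
```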
I’ll see if I can run this on one of my machines with 1.5 TB.
I really appreciate your help. Do you think that I should stop this calculation?
I will look at the source later and see if I can determine how far along it is. Don’t kill it now if you don’t need your computer to do anything else.
@hernan3009 To second @jeffhammond's suggestion: to reduce memory usage, you might want to decrease the tilesize to 16.
@edoapra, thank you. Before writing data to disk I tried the default behavior. I reduced the tilesize; I do not remember the smallest value I tried, but I am pretty sure it was not as small as 2 or 3, and I definitely tried 8 and 10. I did not set the attilesize parameter, and I did not add

```
memory stack 22000 mb heap 100 mb global 22000 mb noverify
```

but only specified the total memory.

Does it make sense to use a very small tilesize (say 1, 2, or 4)?
In any case, there is something I do not understand. I am running this calculation on two cores. I understand that a hard disk is slower than RAM, so intuitively I would expect time to be wasted on disk reads and writes: the processors should be able to consume more data per unit time than a slow disk can deliver. But at the current stage of my calculation I do not see data being written to disk, and both processors are at 100% usage (according to htop). So I suspect that at this stage io EAF is not what is slowing the calculation down. Does that make sense?
P.S.: It is still running.
Tilesize less than 8 doesn’t really make sense.
@hernan3009 Your input seems to have tilesize 19. I would go for 16 or 12
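For reference, tilesize is set inside the tce block; a minimal sketch (the ccsdt keyword stands in for the method actually used in this thread, and 16 is the value suggested above):

```
tce
 ccsdt
 tilesize 16   # smaller tiles -> smaller per-block intermediates and
               # lower memory use, at some cost in efficiency
end
```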
@edoapra Yes. That is the tilesize chosen by NWChem when I did not set it explicitly. I set it to smaller values in previous runs, when I used io ga. I thought that a large tilesize improves performance, and as I had no RAM issues with io EAF, I just let NWChem choose it. Is that bad practice?
@jeffhammond Could you reproduce the behavior that I am experiencing?
I am working on it. My big machine is offline but I'm trying another now.
Got the calculation to complete using 32 nodes, four processes per node, and 4 threads, with tilesize = 16. It took about 30000 seconds:
```
   9   0.0003265546817   617.3   582.3
  10   0.0003432806979   613.9   578.3
 MICROCYCLE DIIS UPDATE:            10   5
  11   0.0000173152244   613.1   577.6
  12   0.0000093283269   613.9   578.7
  13   0.0000040227335   610.0   574.8
  14   0.0000028508313   610.0   575.0
  15   0.0000013125364   624.5   589.9
 MICROCYCLE DIIS UPDATE:            15   5
  16   0.0000001424974   627.0   592.5
  17   0.0000000930854   615.2   580.2
 ---------------------------------------------
 Iterations converged

 CCSDT dipole moments / hartree & Debye
 ------------------------------------
 X        -0.0000000   -0.0000000
 Y        -0.0000000   -0.0000000
 Z         0.2683693    0.6821318
 Total     0.2683693    0.6821318
 ------------------------------------

 CCSDT(2)_Q correlation energy / hartree =   -1.040101618865013
 CCSDT(2)_Q total energy / hartree       = -758.715217414449967

 Cpu & wall time / sec  12947.4  12889.8

 Task  times  cpu: 30441.2s  wall: 29312.4s
```
Thank you so much. It seems that NWChem works perfectly and it was just a lack of computational power.
@edoapra I noticed that the dipole moments from your calculation differ from mine. Did you use the same input file? Could you share the complete output for comparison?
I have used a spherical basis set (at least in the first part of the input). Here are the full input and output files: hern2.nw.txt hern2.out.txt
@edoapra thanks. I did not notice the existence of the spherical keyword.
Just one doubt: do the lines before the TCE block in the input file revert to Cartesian? I mean:

```
basis
 F library def2-tzvp
 Cl library def2-tzvp
end
```

That is, the SCF part was computed with a spherical basis and the TCE part with a Cartesian one. Am I right?
Yes to both of your questions. See https://github.com/nwchemgit/nwchem/wiki/Basis
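For illustration, a minimal sketch of the pattern under discussion (the basis-directive keywords are standard; the two-stage structure here is illustrative, not the exact input from the thread):

```
basis spherical        # used by the SCF step
 F  library def2-tzvp
 Cl library def2-tzvp
end
task scf

basis                  # no keyword: falls back to NWChem's cartesian
 F  library def2-tzvp  # default, so the later TCE step runs cartesian
 Cl library def2-tzvp
end
```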
@edoapra and @jeffhammond: Thank you very much.
My CCSDT(2)_Q calculation has been stuck for about 2 days. I get no error messages, so I do not know whether it is performing a very costly step or whether it just failed. The processors are busy and sometimes the SSD seems to be working, but the files in the working directory have been untouched for the last 30 hours. Is this normal behavior, or is there a problem? Thank you.
And the files present in the directory are:

```
-rw-r--r-- 1 hernan hernan 154435408 2019-08-26 14:09:27.210613602 -0300 input.ccsdt2_q_left_4_1_i1
-rw-r--r-- 1 hernan hernan 1946552 2019-08-26 14:09:26.754615776 -0300 input.ccsdt2_q_left_3_1_i1
-rw-r--r-- 1 hernan hernan 4312 2019-08-26 14:09:26.710615986 -0300 input.ccsdt2_q_left_2_1_i1
-rw-r--r-- 1 hernan hernan 64450253616 2019-08-26 14:09:26.706616005 -0300 input.ccsdt2_q_right_6_1_i1
-rw-r--r-- 1 hernan hernan 340420224 2019-08-26 13:59:30.121421802 -0300 input.ccsdt2_q_right_5_1_i1
-rw-r--r-- 1 hernan hernan 215069513856 2019-08-26 13:58:54.741584521 -0300 input.ccsdt2_q_right_4_1_i1
-rw-r--r-- 1 hernan hernan 2892055216 2019-08-26 12:26:05.565007570 -0300 input.ccsdt2_q_right_3_1_i1
-rw-r--r-- 1 hernan hernan 154435408 2019-08-26 12:20:18.279533476 -0300 input.ccsdt2_q_right_2_1_i1
-rw-r--r-- 1 hernan hernan 1946552 2019-08-26 12:19:36.199837795 -0300 input.ccsdt2_q_right_1_1_i1
-rw-r--r-- 1 hernan hernan 8 2019-08-26 12:19:21.971940581 -0300 input.e
-rw-rw-r-- 1 hernan hernan 64747 2019-08-26 12:19:21.971940581 -0300 salida.out
-rw-r--r-- 1 hernan hernan 9708342080 2019-08-26 10:45:16.369865154 -0300 input.lambda3
-rw-r--r-- 1 hernan hernan 9344784 2019-08-26 10:45:03.481909838 -0300 input.lambda2
-rw-r--r-- 1 hernan hernan 4312 2019-08-26 10:45:03.473909865 -0300 input.lambda1
-rw-r--r-- 1 hernan hernan 9708342080 2019-08-25 10:02:27.431955038 -0300 input.t3
-rw-r--r-- 1 hernan hernan 9344784 2019-08-25 10:01:34.228311196 -0300 input.t2
-rw-r--r-- 1 hernan hernan 4312 2019-08-25 10:01:34.188311464 -0300 input.t1
-rw-r--r-- 1 hernan hernan 1095788360 2019-08-24 14:54:53.383509792 -0300 input.v2
-rw-r--r-- 1 hernan hernan 47120 2019-08-24 14:52:09.508643260 -0300 input.f1
-rw-r--r-- 1 hernan hernan 47120 2019-08-24 14:52:08.684648428 -0300 input.d1z
-rw-r--r-- 1 hernan hernan 44352 2019-08-24 14:52:08.680648452 -0300 input.d1y
-rw-r--r-- 1 hernan hernan 36240 2019-08-24 14:52:08.676648477 -0300 input.d1x
-rw-rw-r-- 1 hernan hernan 2247984 2019-08-24 14:52:08.628648778 -0300 input.db
-rw-rw-r-- 1 hernan hernan 183920 2019-08-24 14:52:08.620648828 -0300 large.mos
-rw-r--r-- 1 hernan hernan 180035 2019-08-24 14:52:08.548649280 -0300 input.cfock
-rw-rw-r-- 1 hernan hernan 34624 2019-08-24 14:52:05.524668244 -0300 small.mos
-rw-rw-r-- 1 hernan hernan 696 2019-08-24 14:52:05.288669725 -0300 input.b
-rw-rw-r-- 1 hernan hernan 696 2019-08-24 14:52:05.288669725 -0300 input.b^-1
-rw-rw-r-- 1 hernan hernan 416 2019-08-24 14:52:05.288669725 -0300 input.c
-rw-rw-r-- 1 hernan hernan 416 2019-08-24 14:52:05.288669725 -0300 input.p
-rw-rw-r-- 1 hernan hernan 80 2019-08-24 14:52:05.288669725 -0300 input.zmat
-rw-rw-r-- 1 hernan hernan 1023 2019-08-24 14:51:13.528994692 -0300 input.inp
```