check if kv2max needs to be that big

g-raffy commented 3 years ago

in himain.F, kv2max is using a magic number 50, and I suspect that this number is way too big:

kv2max = 50 * kmax * kmax

If kmax is set to 5000, then kv2max has the value 1.25e9, which makes the array cov2 use 81.25e9=10 Gb, and coiv2 use 41.25e9=5 Gb

I suspect that cov2 and coiv2 are oversized because the virtual memory used is actually much smaller than 15 Gb when all 5000 channels are used (see @demessh's job 128642.4, which used less than 2 Gb of maxvmem while using more than 5000 channels)

If we can drasaticlly reduce this value, then this would greatly benefit to the cluster, as currently some cores are wasted du to memory shortage...

g-raffy commented 3 years ago

Yester I tried to find the lowest multiplier such that all tests pass and it turns out to be 4 :

kv2max = 4 * kmax * kmax

demessh commented 3 years ago

I did some test calculations for the para-H3O+ -- ortho-H2 collisional system to check the size of the "memory leak", i.e. the difference between the requested memory (minimal virtual memory to request in order to complete the calculation without errors) and the actually used virtual memory (vmem) in the calculations. In test calculations the HIBRIDON code was used with the initial array

kv2max = 50 * kmax * kmax

My test results in numbers are as follows:

if the calculation requires 1110 channels (kmax) then the the requested memory should be >= 2.6 GB, while the maximal virtual memory actually used during the calculations is about 112 MB (the difference is about a multiple of 23)*
if the calculation requires 5500 channels (kmax) then the the requested memory should be >= 24 GB, while the maximal virtual memory actually used during the calculations is about 1.730 GB (the difference is about a multiple of 14).

Both results support that, in principle, if the kv2max coefficient is decreased from 50 by a factor of 10, so to be equal 5, the calculations still could be performed successfully.

I would ask Paul Dagdigian @pdagdigian, what is his opinion about this issue. He understand the best the scientific background behind the code details. Maybe you can tell us, Paul, why a value of 50 was set as a constant coefficient for kv2max?

Just as a reminder, in the himain.F source file the following comments are present:

* kv2max sets the maximum size of the v2 matrix, a reasonable size is
* kmax**2
* increase kv2max from 2 * kmax * kmax to 20 * kmax * kmax (pjd - 13-dec-2019)

pdagdigian commented 3 years ago

Thanks for your question. You have checked inot memory size more than I have.

You should feel free to change the parameters sizes to fit the calculation. Note that these changes should be made in the file himain.t, NOT himain.f. The latter is recreated from himain.t during the makehib operation.

Below I give the lines of code from human.t where the parameters are set:

increase size of jout array from 21 to 25 - p.dagdigian - 23-jun-2010
increase size of jout array from 25 to 50 - p.dagdigian - 19-sep-2012 parameter (klammx = 2000, kfact = 3500, kout = 50, ken = 25) parameter (kmxpho = 3, knphot = 1) parameter (kmxbas = 30) parameter (kq2 = 2 kmax, kqmax = kmxpho kmax)
kv2max sets the maximum size of the v2 matrix, a reasonable size is
kmax**2
increase kv2max from 2 kmax kmax to 20 kmax kmax (pjd - 13-dec-2019)
parameter (kv2max = 2 kmax kmax) parameter (kv2max = 50 kmax kmax)
krotmx set the maximum size of the asym top e.fn expansion
increased factor multiplying kmax below and narray from 12 to 100 (pjd: 19-sep-2017) parameter (krotmx = 500 * kmax)

On Oct 4, 2021, at 10:41 AM, demessh @.**@.>> wrote:

  External Email - Use Caution

I did some test calculations for the para-H3O+ -- ortho-H2 collisional system to check the size of the "memory leak", i.e. the difference between the requested memory (minimal virtual memory to request in order to complete the calculation without errors) and the actually used virtual memory (vmem) in the calculations. In test calculations the HIBRIDON code was used with the initial array

kv2max = 50 kmax kmax

My test results in numbers are as follows:

if the calculation requires 1110 channels (kmax) then the the requested memory should be >= 2.6 GB, while the maximal virtual memory actually used during the calculations is about 112 MB (the difference is about a multiple of 23)*
if the calculation requires 5500 channels (kmax) then the the requested memory should be >= 24 GB, while the maximal virtual memory actually used during the calculations is about 1.730 GB (the difference is about a multiple of 14).

Both results support that, in principle, if the kv2max coefficient is decreased from 50 by a factor of 10, so to be equal 5, the calculations still could be performed successfully.

I would ask Paul Dagdigian @pdagdigianhttps://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fpdagdigian&data=04%7C01%7Cpjdagdigian%40jhu.edu%7C56f00be6b8444c9914ca08d987451979%7C9fa4f438b1e6473b803f86f8aedf0dec%7C0%7C0%7C637689553115998997%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=vHq3dhbINm5jZjWiYyhnkdHG53qER8k7CtopoZLW46s%3D&reserved=0, what is his opinion about this issue. He understand the best the scientific background behind the code details. Maybe you can tell us, Paul, why a value of 50 was set as a constant coefficient for kv2max?

Just as a reminder, in the himain.F source file the following comments are present:

kv2max sets the maximum size of the v2 matrix, a reasonable size is
kmax**2
increase kv2max from 2 kmax kmax to 20 kmax kmax (pjd - 13-dec-2019)

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhibridon%2Fhibridon%2Fissues%2F51%23issuecomment-933554742&data=04%7C01%7Cpjdagdigian%40jhu.edu%7C56f00be6b8444c9914ca08d987451979%7C9fa4f438b1e6473b803f86f8aedf0dec%7C0%7C0%7C637689553116008952%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=0NP86gDd7jkEIH6nJWGTErnLxg51j8ac4KYCJgQxvnk%3D&reserved=0, or unsubscribehttps://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FANL4IEWYG3L5QH65J5472K3UFG4KZANCNFSM5FCCOLXQ&data=04%7C01%7Cpjdagdigian%40jhu.edu%7C56f00be6b8444c9914ca08d987451979%7C9fa4f438b1e6473b803f86f8aedf0dec%7C0%7C0%7C637689553116008952%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=OfmxgMWOEKof8QQkF0VC1pSGOgEn5Ku2JcExPOzxbX0%3D&reserved=0. Triage notifications on the go with GitHub Mobile for iOShttps://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fapps.apple.com%2Fapp%2Fapple-store%2Fid1477376905%3Fct%3Dnotification-email%26mt%3D8%26pt%3D524675&data=04%7C01%7Cpjdagdigian%40jhu.edu%7C56f00be6b8444c9914ca08d987451979%7C9fa4f438b1e6473b803f86f8aedf0dec%7C0%7C0%7C637689553116018910%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=wvauBWCoq%2F3XWsG7maEKGvP2aqdXCoZ%2BJ0mXedgh%2B8E%3D&reserved=0 or Androidhttps://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fplay.google.com%2Fstore%2Fapps%2Fdetails%3Fid%3Dcom.github.android%26referrer%3Dutm_campaign%253Dnotification-email%2526utm_medium%253Demail%2526utm_source%253Dgithub&data=04%7C01%7Cpjdagdigian%40jhu.edu%7C56f00be6b8444c9914ca08d987451979%7C9fa4f438b1e6473b803f86f8aedf0dec%7C0%7C0%7C637689553116018910%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=W55nHM5B9ryrwRnuoFUgo3JqEreg%2FwFwKc6jwZMDsXY%3D&reserved=0.

Paul J. Dagdigian Academy Professor Professor Emeritus Department of Chemistry, The Johns Hopkins University Baltimore, MD 21218

Tel: 1-410-516-7438; FAX: 1-410-516-8420 http://chemistry.jhu.edu/directory/paul-dagdigian/

g-raffy commented 2 years ago

I'm back on this issue because it will probably fix issue #87, which is blocking for @AmelieGodard and @MichMich098

So, this afternoon I merged the master, and the 2 basis that currently work with a growable v2 array (12 and 26) passed the tests... pfew!

Tomorrow, I'll try to migrate basis 30 to use the growable v2, as it's the basis that @AmelieGodard needs.

ppfn commented 2 years ago

Maybe the tests passed before making use of the « growth » of this array ? The basis in the tests are usually very small, is it possible to check for large bases ?

g-raffy commented 2 years ago

Maybe the tests passed before making use of the « growth » of this array ?

@ppfn hello! I'm not sure what you mean here. To make things clearer, the branch dealing with issue #51 only passes tests with bases 12 an 26 because only these 2 have currently been converted to growable v2. The others are currently broken because hibridon's code expects bases to work with growable v2

The basis in the tests are usually very small, is it possible to check for large bases ?

Of course, the plan is to test it on the case of issue #87, which uses base 30. Not sure that adding a test for big systems as it would mean that this test would fail on small machines.. unless we add this test to a special test suite...

g-raffy commented 2 years ago

The commit 8e1e496 fixes this issue but I still need to see how these changes affects overall performance (early tests show a slowdown of about 5% at the moment).

g-raffy commented 2 years ago

just added more work related to this issue: 31318d6

g-raffy commented 2 years ago

commit dbb91923b7c7ab3848a1957ea0424821abd1208e fixes this issue. The storage for angular coupling matrices v2 is now growable. No need for kv2max anymore.

AmelieGodard commented 2 years ago

Bonjour Guillaume,

Sounds very nice, I can not wait to try it!

Amélie Godard

Institut de Physique de Rennes UMR 6251 - CNRS/Université de Rennes 1 Bât. 11B, Campus de Beaulieu 263 avenue du Général Leclerc 35042 Rennes CEDEX, France Phone: +33(0)6 04 03 01 41

https://collexism.com https://www.researchgate.net/profile/Amelie-Godard-Palluet

Le 30 juin 2022 à 00:29, g-raffy @.***> a écrit :

commit dbb9192 https://github.com/hibridon/hibridon/commit/dbb91923b7c7ab3848a1957ea0424821abd1208e fixes this issue. The storage for angular coupling matrices v2 is now growable. No need for kv2max anymore.

— Reply to this email directly, view it on GitHub https://github.com/hibridon/hibridon/issues/51#issuecomment-1170558307, or unsubscribe https://github.com/notifications/unsubscribe-auth/AYFSWDEVT3HZQHRH5GSEDKDVRTE5VANCNFSM5FCCOLXQ. You are receiving this because you were mentioned.

demessh commented 2 years ago

commit dbb9192 fixes this issue. The storage for angular coupling matrices v2 is now growable. No need for kv2max anymore.

Thank you! I will make some benchmarks and tests on my systems soon.

hibridon / hibridon

check if kv2max needs to be that big #51