Closed mfherbst closed 3 years ago
So should we try to get the stuff in that you already checked above? You could make an extra PR and I can review there?
That's what I'm doing. Will take a few mins to get everything sorted out.
This pull request introduces 2 alerts when merging 9b40e66c70724cc7fb9d8e1c2b3e05bd1721da7f into 60e201cccaa684579dca0ae7eb28232d3adce482 - view on LGTM.com
new alerts:
This pull request introduces 2 alerts when merging c5860e52cd1ecfae5584f5c5bbbb3227cc8dccb2 into af7bdff6dcb4c07ef53f73a492a40915410a455d - view on LGTM.com
new alerts:
This pull request introduces 1 alert when merging a7888bd537db3b3161b573f73deac42f847c9a33 into e302303cbfd85bcfadf5d62fc915b997a73e56c9 - view on LGTM.com
new alerts:
This pull request introduces 1 alert when merging cde1657611106bffa89375ee341224aa0f8f9915 into d6adad92744a936f015074131b19f19940dae594 - view on LGTM.com
new alerts:
This pull request introduces 3 alerts when merging ef73fac44a0ead2e978f91ed6558c1afb6c3e8df into 3afd0bb9977c1c83ee178d6d3a5ef5b4117a582e - view on LGTM.com
new alerts:
This pull request introduces 4 alerts when merging 6727ab13da0edace225c629da1b88cf1912b3a97 into 3afd0bb9977c1c83ee178d6d3a5ef5b4117a582e - view on LGTM.com
new alerts:
@maxscheurer Now would probably a good time to test this on some bigger cases. I ran a few tests with the compare_implementations.py
script and these are my results (with 2 threads, 5 takes and 30 repeats if you want to reproduce). The time shown is the time per apply. First column is the C++ implementation, second column this new python implementation, next are the absolute and percentage changes relative to C++.
#
#-- water cc-pvtz
#
nocc = 10 nvirt = 106
full adc2 18.97 ms 17.89 ms = -1.07 ms -5.66%
full cvs-adc2 22.24 ms < 27.73 ms = 5.49 ms 24.67%
full adc2x 60.86 ms 60.13 ms = -0.73 ms -1.20%
full cvs-adc2x 58.58 ms < 60.50 ms = 1.92 ms 3.28%
full adc3 79.15 ms 76.12 ms = -3.03 ms -3.83%
full cvs-adc3 149.82 ms 73.89 ms = -75.93 ms -50.68%
#
#-- furane cc-pvdz
#
nocc = 36 nvirt = 144
full adc2 163.60 ms 144.70 ms = -18.90 ms -11.55%
full cvs-adc2 59.69 ms < 60.39 ms = 0.71 ms 1.18%
full adc2x 921.28 ms 900.98 ms = -20.30 ms -2.20%
full cvs-adc2x 293.75 ms < 295.82 ms = 2.07 ms 0.70%
full adc3 1053.22 ms 1015.08 ms = -38.14 ms -3.62%
full cvs-adc3 2730.74 ms 399.03 ms = -2331.71 ms -85.39%
#
#-- ch2nh2 cc-pvtz
#
nocc = 17 nvirt = 215 unrestricted
full adc0 1.02 ms < 1.29 ms = 0.27 ms 26.24%
full cvs-adc0 1.00 ms < 1.24 ms = 0.24 ms 23.72%
full adc1 10.23 ms < 11.11 ms = 0.88 ms 8.64%
full cvs-adc1 4.30 ms < 4.52 ms = 0.22 ms 5.03%
full adc2 322.69 ms 293.08 ms = -29.61 ms -9.18%
full cvs-adc2 194.76 ms < 196.67 ms = 1.90 ms 0.98%
full adc2x 3210.40 ms 3131.25 ms = -79.15 ms -2.47%
full cvs-adc2x 1763.66 ms 1755.50 ms = -8.16 ms -0.46%
full adc3 3597.51 ms 3473.48 ms = -124.03 ms -3.45%
full cvs-adc3 6114.51 ms 2063.04 ms = -4051.47 ms -66.26%
I'll add a case to try an unrestricted reference now as well.
As clearly visible there were quite a few low-hanging fruits to capture for CVS-ADC(3), which turn out to be quite worth picking. Could be of interest for @t-fransson in the future.
I'm already running a test with PNA, adc2x/aug-cc-pvdz. I can kick off some tests with your script tomorrow. 👍🏻
I see. Would it be much of a hassle to restart with the current version? I made quite a few improvements and it would be cool to see if they play off in practice or not.
This pull request introduces 3 alerts when merging 545c6e775bc4fe95e2be0d352756a4c539acf4ce into 3afd0bb9977c1c83ee178d6d3a5ef5b4117a582e - view on LGTM.com
new alerts:
I'm pretty convinced about the python stuff now, so I'll start removing the C++ interface. For doing the comparison @maxscheurer you will therefore need to use one of the previous commits, e.g. 545c6e775bc4fe95e2be0d352756a4c539acf4ce
This pull request introduces 4 alerts when merging 6079f03fbd322ac4c46fb4eec7b3231060643f0e into 3afd0bb9977c1c83ee178d6d3a5ef5b4117a582e - view on LGTM.com
new alerts:
Okay, cannot restart right now, but will run again tomorrow with a previous commit including the comparison script.
This pull request introduces 5 alerts when merging e79f67d0e8c92ce2a9e004a170a293fa27b68977 into 3afd0bb9977c1c83ee178d6d3a5ef5b4117a582e - view on LGTM.com
new alerts:
This pull request introduces 1 alert when merging 8d94d71b61b4f66e668dba2cc1571881296b8198 into 3afd0bb9977c1c83ee178d6d3a5ef5b4117a582e - view on LGTM.com
new alerts:
This pull request introduces 1 alert when merging fba958cae44af5400ad7d855b2395fad3b35d737 into 3afd0bb9977c1c83ee178d6d3a5ef5b4117a582e - view on LGTM.com
new alerts:
will be updated
repeat=20, n_threads=8, takes=10
8 threads are fine, usually well busy during the benchmark run 👍
[x] pNA/cc-pVDZ
[ ] pNA/aug-cc-pVDZ (Running 🚀)
#
#-- pna cc-pvdz
#
nocc = 72 nvirt = 268
full adc0 2.71 ms < 3.17 ms = 0.46 ms 17.04%
full cvs-adc0 1.66 ms < 2.05 ms = 0.40 ms 23.88%
full adc1 102.78 ms < 105.62 ms = 2.84 ms 2.76%
full cvs-adc1 9.29 ms 8.86 ms = -0.44 ms -4.69%
full adc2 1620.54 ms 1384.09 ms = -236.45 ms -14.59%
full cvs-adc2 486.65 ms < 503.37 ms = 16.71 ms 3.43%
full adc2x 16562.88 ms 16312.25 ms = -250.63 ms -1.51%
full cvs-adc2x 3272.96 ms < 3286.17 ms = 13.21 ms 0.40%
full adc3 17893.81 ms 17697.08 ms = -196.73 ms -1.10%
full cvs-adc3 57672.94 ms 3461.21 ms = -54211.72 ms -94.00%
Run with 2 threads, adc3
crashed due to memory not being cleaned up?!
#
#-- pna aug-cc-pvdz
#
nocc = 72 nvirt = 496
full adc0 2.25 ms < 2.43 ms = 0.18 ms 7.94%
full cvs-adc0 1.22 ms < 1.48 ms = 0.26 ms 21.51%
full adc1 483.81 ms < 488.21 ms = 4.40 ms 0.91%
full cvs-adc1 10.83 ms 7.99 ms = -2.85 ms -26.30%
full adc2 12334.86 ms 12036.59 ms = -298.28 ms -2.42%
full cvs-adc2 3731.10 ms < 3767.35 ms = 36.25 ms 0.97%
full adc2x 206003.00 ms 202902.75 ms = -3100.25 ms -1.50%
full cvs-adc2x 48776.73 ms < 48907.87 ms = 131.14 ms 0.27%
Cool, so ADC(0) methods we can definitely ignore since they are just for testing. Otherwise python is roughly equally good in the worst case I'd say. Our new CVS-ADC(3) really shines. Quite impressive.
Out of the bad ones CVS-ADC(2) probably hurts the most, but even there it is ok. Here even with 50 iteration steps it would just be 1s on a run that takes 25 minutes, so that's probably ok. Let's see how it looks for a bigger basis with more virtuals, so where the CVS-ADC(2) times might start to matter more.
On the upside for ADC(2) we actually win quite a bit. I would not have thought that 10% gain to be so universal.
This pull request introduces 1 alert when merging 359ff5e687b90698816d8e23f3079c64347136e2 into 3afd0bb9977c1c83ee178d6d3a5ef5b4117a582e - view on LGTM.com
new alerts:
This pull request introduces 1 alert when merging 4e0daa7f7cf4a32c00c4a0dd2fe128210a22c5ae into 3afd0bb9977c1c83ee178d6d3a5ef5b4117a582e - view on LGTM.com
new alerts:
This pull request introduces 1 alert when merging b4e1d245074afe2aeb037b3cafe4932281e455d6 into 3afd0bb9977c1c83ee178d6d3a5ef5b4117a582e - view on LGTM.com
new alerts:
This pull request introduces 1 alert when merging 02cd6f94b0da8846c3932f611d0e763999b7db09 into 3afd0bb9977c1c83ee178d6d3a5ef5b4117a582e - view on LGTM.com
new alerts:
Ups thanks for noticing. I'll update ;)
This pull request introduces 1 alert when merging ee36366ace923942abb5737d86782a4bc932d8b7 into 3afd0bb9977c1c83ee178d6d3a5ef5b4117a582e - view on LGTM.com
new alerts:
Yes and if you count the implicit C++ code we can remove ...
I think I have about one more evening of coding and then one of cleanup and then it should be good.
This pull request introduces 2 alerts when merging d523af6f807a316d28c343cb120c20c7725cb33e into 3afd0bb9977c1c83ee178d6d3a5ef5b4117a582e - view on LGTM.com
new alerts:
This pull request introduces 1 alert when merging 7a612e53281515a1ec1bd13ecf816d104dd5ac00 into 3afd0bb9977c1c83ee178d6d3a5ef5b4117a582e - view on LGTM.com
new alerts:
This pull request introduces 1 alert when merging 5ab7208a2ed02cfa749e5cd0427a2c9850728b64 into 3afd0bb9977c1c83ee178d6d3a5ef5b4117a582e - view on LGTM.com
new alerts:
I will give this a quick look, then it's g2g.
@maxscheurer Any updates on the timings?
Otherwise looks good to go from my end! Feel free to re-review or merg.
Oh and picking up your earlier comments ... it only adds a net of 350 lines after all :smile: (and there are quite a few deprecation hints we can remove soon).
Just updated the timings (aug-cc-pvdz on 2 cores), and they look good 👍
Still at a very early stage. Just to stash some work ...
Done in #77:
Done in this PR:
blocks
,"s"
,"d"
etc.Not so clear stuff:
hf.b.cv
versusb.cv
versushf.b.Cv
??[ ] Quick-access like functionality for T2 amplitudes and density matrices?:arrow_right: postponed[ ] Quick-access like functionality for other things of LazyMp?:arrow_right: postponed