jjgoings / McMurchie-Davidson

do a simple closed shell Hartree-Fock using McMurchie-Davidson to compute integrals
BSD 3-Clause "New" or "Revised" License
78 stars 17 forks source link

grad.pyx: ln 282 and ln 335 #23

Closed shaobin closed 2 years ago

shaobin commented 2 years ago

Should "for v in range(n1+n2+1+r1z+1)" (line 282 and 335) be "for v in range(n1+n2+1+r1z)"?

jjgoings commented 2 years ago

I think you are correct, but I need to take a closer look! Happy to accept PR if you fix and add test, otherwise I'll get to it later.

shaobin commented 2 years ago

By your design, for the six indices t, u, v, tau, nu, phi (five for E and one for Ex), only the one for calling Ex should be with an additional +1, right? If nothing special for the (center == c, x == 2) and (center ==d, x == 2) cases, then I guess this might be a mistake.

After removing the +1 for both two lines, the existing code run smoothly. I've tested it in calculating forces. As the difference of the final results is negligible, I don't know how to well test the changes. Any thoughts?

pwborthwick commented 2 years ago

Hi, Josh's test002.py has commented out a test of McMurchie-Davidson with force reference calculations done on Gaussian 16. (I think Josh commented it out simply because it took time to run). The test is for water and as you see the non-zero values compare as... reference | mmd ........................................................... 0.097441437 | 0.0974414370 Oy -0.086300098 | -0.0863000979 Hx -0.048720718 | -0.0487207185 Hy All good there!

Having said that I'm sure the extra '+1' in the loop conditions are a typo (could it be a cut-and-paste problem). I've 'corrected?' the code in my version of mmd and re-compiled the cython and it makes no difference at all to the printed accuracy of the results for this molecule. Unfortunately I couldn't make a direct comparison with my gradient implementation because I've parameterized the loop blocks reducing the 4x3 blocks to just one and a calling loop, but that enforces that there is only one doubly augmented loop in each block. I've had a quick look at Helgaker and couldn't see where the extra 1 might come from.

shaobin commented 2 years ago

@pwborthwick , I also think this should be a typo caused by cut-and-paste. After checking the code path, I think the extra '+1' does not do actual harm for the force calculation test case:

In the case r1z == 0, the last value of the index v is n1+n2+1. Calling E(int i,int j,int t,double Qx,double a,double b, int n, double Ax), which is defined in util.pyx, we will take the n == 0 and t > (i +j) code path (line 17 and 18). Zero will be returned. So the final value should be the same; but some CPU cycles are wasted in doing more loops.

jjgoings commented 2 years ago

Thanks all. I agree this is a bug, but thankfully not one that affects the results. I've done multiple tests now and it seems to have no affect beyond performance.

jjgoings commented 2 years ago

Fixed in 2f8b77d6661ba5ff1e2111d469839ca2160a1c80