RMeli / irc

Transfrormation between Cartesian coordinates and redundant internal coordinates
MIT License
22 stars 8 forks source link

cartesian_to_irc produces 180 degree dihedrals with random sign #52

Open AlexB67 opened 3 years ago

AlexB67 commented 3 years ago

Hello, I hope I can explain this right, but I observed the issue as described in the title. Take ethane as an example. it produces the following coordinates:

New internal
  (redundant) coordinates: 28
*************************************
  7 bonds / Å
*************************************
  (   0,   1)        1.4613
  (   0,   2)        1.0941
  (   1,   3)        1.0941
  (   0,   4)        1.0941
  (   1,   5)        1.0941
  (   0,   6)        1.0941
  (   1,   7)        1.0941
*************************************
  12 angles / °:
*************************************
  (   1,   0,   2)  111.84
  (   0,   1,   3)  111.84
  (   1,   0,   4)  111.84
  (   2,   0,   4)  107.00
  (   0,   1,   5)  111.84
  (   3,   1,   5)  107.00
  (   1,   0,   6)  111.84
  (   2,   0,   6)  107.00
  (   4,   0,   6)  107.00
  (   0,   1,   7)  111.84
  (   3,   1,   7)  107.00
  (   5,   1,   7)  107.00
*************************************
  9 dihedrals / °
*************************************
  (   2,   0,   1,   3)   -60.00
  (   3,   1,   0,   4)    60.00
  (   2,   0,   1,   5)   180.00
  (   4,   0,   1,   5)   -60.00
  (   3,   1,   0,   6)   -180.00
  (   5,   1,   0,   6)    60.00
  (   2,   0,   1,   7)    60.00
  (   4,   0,   1,   7)  -180.00
  (   6,   0,   1,   7)   -60.00
*************************************

If you run cartesian_to_iirc successive times the outcome of the 180 dihedral angles flip sign. it may give

 (   2,   0,   1,   5)   180.00
 (   3,   1,   0,   6)  -180.00
 (   4,   0,   1,   7)  -180.00

or sometimes

(   2,   0,   1,   5)  -180.00
(   3,   1,   0,   6)  180.00
(   4,   0,   1,   7)  180.00

but only for the 180 degree case.

As you can imagine this can pay havoc with geometry optimization because dihedrals that have flipped 360 degrees during a geometry optimization step can give rogue values for dq,

one can get dq = 2 pi when it should have been dq = 0, or something very small.

Yesterday, I put a plaster in my own code to detect this flip, so I could work around the issue. Of course, it would be nice to see what the issue in libirc is.

Thus far, I've made the following change; It works in all my tests so far, but not knowing your code base very well I don't if there could be unwanted consequences

In constsnts.h I added

// dihedrals near 2 pi
constexpr double dihedral_near_180_tol{1e-6};

and in connectivity.h I changed the following lines, around 440 thereabouts

// Compute dihedral angle in radians (in the interval [-pi,pi])
  double angle{std::atan2(y, x)};

  if (std::fabs(std::fabs(angle) - tools::constants::pi) < tools::constants::dihedral_near_180_tol) 
        angle = std::fabs(angle);

  return angle;

What do you think ?

Best regards.

oh ad thanks for the library, great stuff.

AlexB67 commented 3 years ago

Okay, my trick does have a knock on. My own plaster outside libirc still works though, Something more is required. Your input would be appreciated.

As an example from my code when validating the back transform. Observe the values flipped sign for coordinate 22 and 27 , so the error is 2pi when it should have been 0 or close.

IRC to cartesian back-step converged in 3 iterations.
  New cartesian coordinates: 

*****************************************
          Back transform validation of q
   #    q_expected        error
*****************************************
   1    2.875207261253   -0.000000000000
   2    2.065606707972    0.000000000021
   3    2.065606707941    0.000000000021
   4    2.065606707940    0.000000000021
   5    2.065606707972    0.000000000021
   6    2.065606707940    0.000000000021
   7    2.065606707939    0.000000000021
   8    1.940904706174   -0.000002289793
   9    1.940904706050   -0.000002289793
  10    1.940904706004   -0.000002289793
  11    1.879418451977   -0.000002147858
  12    1.940904706156   -0.000002289793
  13    1.879418451766   -0.000002147858
  14    1.940904706028   -0.000002289793
  15    1.879418451767   -0.000002147858
  16    1.879418451704   -0.000002147859
  17    1.940904706000   -0.000002289793
  18    1.879418451736   -0.000002147859
  19    1.879418451946   -0.000002147858
  20   -1.047197551214   -0.000000000000
  21    1.047197551411   -0.000000000001
  22   -3.141592653591    6.283185307180
  23   -1.047197550966   -0.000000000000
  24   -3.141592653589    0.000000000000
  25    1.047197551213    0.000000000000
  26    1.047197551018    0.000000000000
  27    3.141592653643   -6.283185307180
  28   -1.047197551358    0.000000000000

It would easy enough to add the ethane test case to your tests. I haven't done that to see what happens.

Cheers.

peterbygrave commented 3 years ago

I think you should be determining all differences (including "error" column) in dihedral angles in the [-180, 180) domain. If you don't do this then -179 to 179 is a 358 change rather than the more logical 2 change. The thresholding only helps for that small window, you still have the same problem at -180+thr+epsilon.

peterbygrave commented 3 years ago

The reproducibility of the values is a little more concerning, one should be able to write a test just in the IRC library that confirms there is a problem.

RMeli commented 3 years ago

The reproducibility of the values is a little more concerning, one should be able to write a test just in the IRC library that confirms there is a problem.

Agreed. I added a test with ethane where cartesian_to_irc is called multiple (1000) times and the results are consistent on my end.

@AlexB67 could you please provide a bit more information on the system and the functions you are calling so that I can try to reproduce the issue? If I use the following cartesian coordinates (in angstrom)

8

C -0.7560    0.0000    0.0000
C  0.7560    0.0000    0.0000
H -1.1404    0.6586    0.7845
H -1.1404    0.3501   -0.9626
H -1.1405   -1.0087    0.1781
H  1.1404   -0.3501    0.9626
H  1.1405    1.0087   -0.1781
H  1.1404   -0.6586   -0.7845

I consistently get the following internal coordinates for the dihedral angles (in radians):

 1.0472
 3.1416
-1.0472
-1.0472
 1.0472
 3.1416
 3.1416
-1.0472
 1.0472

Are you talking about successive back and forth transformations?

AlexB67 commented 3 years ago

Hi,

Thanks for replying. Leave it with me. I would hate to discover it;s something on my end I am doing. I understanding the code base bit by bit. When I have a bit of time, it'll require a strong coffee, and make sure I understand it better.

Yes, I am talking about successive back and forth. Please don't spend any time on my behalf on it until I have a better understanding.

Best ...

AlexB67 commented 3 years ago

I think you should be determining all differences (including "error" column) in dihedral angles in the [-180, 180) domain. If you don't do this then -179 to 179 is a 358 change rather than the more logical 2 change. The thresholding only helps for that small window, you still have the same problem at -180+thr+epsilon.

Yes, I had considered that It was a kind of temporary thing to experiment if it worked around the near 180 case in my geometry optimization, so for now I changed nothing in libirc

I thought i had a bug in my BFGS step, but I don't, that's how I came to discover the whole thing. All the updates for RFO, ,BFGS Hessian matched the output of Psi4 once i just kept track of the sign flips in my own code, but it's probably not 100% full proof.

Best ...

AlexB67 commented 3 years ago

The reproducibility of the values is a little more concerning, one should be able to write a test just in the IRC library that confirms there is a problem.

I'll also do this. i'll keep you posted.

Thx.

RMeli commented 3 years ago

@AlexB67 I added a small test for consecutive forward and backward transformations (this is where the error appears if I understand correctly?) as follows:

    const vec irc_0 = cartesian_to_irc<vec3, vec>(x_c, B, A, D, LA, OOPB);
    vec irc = irc_0;

    size_t n_irc = irc.size();

    // Small test on reproducibility, related to issue #52
    for(std::size_t n{0}; n < 100; n++){
      auto result = irc_to_cartesian<vec3, vec, mat>(
          irc, linalg::zeros<vec>(n_irc), x_c, B, A, D, LA, OOPB);

      REQUIRE(result.converged);

      const vec x_c_new = result.x_c;
      irc = cartesian_to_irc<vec3, vec>(x_c_new, B, A, D, LA, OOPB);

      // Check IRC are the same as the first one
      REQUIRE(irc.size() == irc_0.size());
      for(size_t i{0}; i < n_irc; i++){
        CHECK(irc(i) == Approx(irc_0(i)));
      }
    }

Unfortunately, it does not seem to reproduce the problem (internal coordinates always match the first conversion).

Please correct me if I am misunderstanding the problem, but without more details on how to reproduce it in the library it will be tricky to pinpoint the issue.

AlexB67 commented 3 years ago

You are understanding it perfectly. I'll dig deeper. thx for that.

AlexB67 commented 3 years ago

Hello,

Not had much time, but what I found thus far. The inconsistent signs happen as the structure changes during a geometry optimization which is not captured in the above test.

This has led me to the following test. I have taken two structures, one has been optimized previously, not the other, They are both D3d.

Taking either as just an input structure the following happens. (Note: No geometry optimization happening here in the current run).

Generate internal coordinates using optimized structure 2 as input. Dihedrals are reported like so

My input structure

 Atom  X / bohr          Y / bohr          Z / bohr          Mass
***********************************************************************
   C    0.000000000000   -0.000000000000   -1.440231664303   12.0000000
   C    0.000000000000   -0.000000000000    1.440231664303   12.0000000
   H    1.667423757259    0.962687555107   -2.187063524327    1.0078250
   H    1.667423757259   -0.962687555107    2.187063524327    1.0078250
   H    0.000000000000   -1.925375110213   -2.187063524327    1.0078250
   H   -1.667423757259   -0.962687555107    2.187063524327    1.0078250
   H   -1.667423757259    0.962687555107   -2.187063524327    1.0078250
   H    0.000000000000    1.925375110213    2.187063524327    1.0078250

irc says

*************************************
  9 dihedrals / °
*************************************
  (   2,   0,   1,   3)   -60.00
  (   3,   1,   0,   4)    60.00
  (   2,   0,   1,   5)  -180.00
  (   4,   0,   1,   5)   -60.00
  (   3,   1,   0,   6)  -180.00
  (   5,   1,   0,   6)    60.00
  (   2,   0,   1,   7)    60.00
  (   4,   0,   1,   7)   180.00
  (   6,   0,   1,   7)   -60.00
*************************************

Using the original unoptimized structure 1 as input

input structure
Atom  X / bohr          Y / bohr          Z / bohr          Mass
***********************************************************************
   C    0.000000000000    0.000000000000   -0.944863062727   12.0000000
   C    0.000000000000    0.000000000000    0.944863062727   12.0000000
   H    1.636550954077    0.944863134536   -2.078698737999    1.0078250
   H    1.636550954077   -0.944863134536    2.078698737999    1.0078250
   H    0.000000000000   -1.889726267183   -2.078698737999    1.0078250
   H   -1.636550954077   -0.944863134536    2.078698737999    1.0078250
   H   -1.636550954077    0.944863134536   -2.078698737999    1.0078250
   H    0.000000000000    1.889726267183    2.078698737999    1.0078250

irc says
*************************************
  9 dihedrals / °
*************************************
  (   2,   0,   1,   3)   -60.00
  (   3,   1,   0,   4)    60.00
  (   2,   0,   1,   5)   180.00
  (   4,   0,   1,   5)   -60.00
  (   3,   1,   0,   6)   180.00
  (   5,   1,   0,   6)    60.00
  (   2,   0,   1,   7)    60.00
  (   4,   0,   1,   7)   180.00
  (   6,   0,   1,   7)   -60.00
*************************************

The sign flips are happening in cartesian_to_irc. Where In some cases it generates a -180 angle for structure X, but +180 for structure Y.

My gut says that since the dihedral angles are not changing it this structure, apart from some numerical noise during optimization. The signs should stay consistent, but they are not. my fault ? libirc ? Don't know yet. In any case, the above test is independent of any geometry optimization, I am just reporting the libirc internal coordinates given a Cartesian input structure.

Any thoughts on the different reported dihedrals for both structures. It's only the 180 degree case. Other angles are reported consistently. I've not made an isolated test case in libirc to double check.

Cheers

AlexB67 commented 3 years ago

Okay. I added 2 structures to the test suite to isolate it from my code. I get the same results. I've experimented with the internals of irc a bit, but no cigar as yet. I suppose, one question that arises, should dihedrals need to have a negative sign at all. The WDC convention is for dihedrals to be always positive.

RMeli commented 3 years ago

@AlexB67 Thanks for the additional details, are the test available somewhere? If you could open a PR against the issue-52 branch it would be great, so that the problems will be picked up by CI and I can also try tor reproduce it locally.

(I don't have a lot of spare time this week, so it might take a while.)

AlexB67 commented 3 years ago

Hi, Thx, Don't sweat it, whenever you can, I'll add them. Busy myself so not sure when either, end of week perhaps or I may find some time in between. Everything else is working great. My current plaster fix in my l own code has worked in all cases anyways. A testament to your code that all my results tally with the psi4 suite, i.e. results in the same optimised structures with the Schlegel hessian I added.

AlexB67 commented 3 years ago

FWIW The closest I have come as a code smell fix. Adding this one line to the connectivity function for dihedrals, though it can induce very small rounding errors transforming to and from irc to cartesian, in practice quite negligible I think.

All your test still pass with this change, including my new one, and my own code behaves without requiring my current plaster outside libirc, I admit I don't feel comfortable about doing it this way.

Making dihedral angles other than 180 all positive as per WDC convention makes everything go bust, more serious intervention would be required.

The reason the sign flips is the atan2(y/x) call sometimes has a value of -0 or +0 for y in the numerator for the different structures.

That's all I can offer this week.

// Code smell addition

template<typename Vector3>
inline double dihedral(const Vector3& v1,
                       const Vector3& v2,
                       const Vector3& v3,
                       const Vector3& v4) {
  const Vector3 b1{v1 - v2};
  const Vector3 b2{v2 - v3};
  const Vector3 b3{v3 - v4};

  Vector3 n1{linalg::cross(b1, b2)};
  Vector3 n2{linalg::cross(b2, b3)};

  n1 /= linalg::norm(n1);
  n2 /= linalg::norm(n2);

  const Vector3 m{linalg::cross(n1, b2) / linalg::norm(b2)};

  const double x{linalg::dot(n1, n2)};
  const double y{linalg::dot(m, n2)};

  // Compute dihedral angle in radians (in the interval [-pi,pi])
  double angle{std::atan2(y, x)};
  // only for the 180 case
  if (std::fabs(y) < 1e-6 && std::fabs(std::fabs(angle) - tools::constants::pi) < 1e-6)
      angle = std::atan2(fabs(y), x);

  return angle;

// end code smell

Cheers :)