theochem / denspart

Atoms-in-molecules density partitioning schemes based on stockholder recipe
GNU General Public License v3.0
19 stars 11 forks source link

Explicitly adding a constraint does not work #11

Closed yingxingcheng closed 1 year ago

yingxingcheng commented 1 year ago

By adding constraints explicitly, i.e., $\sum{A,i} c{Ai} = \int \rho(\vec{r}) d\vec{r} = N_{mol}$, we cannot necessarily obtain the same result. This equality constraint may be too strict for the minimizer.

    with np.errstate(all="raise"):
        # The errstate is changed to detect potentially nasty numerical issues.
        # Optimize parameters within the bounds.

        A = np.zeros((len(pars0), len(pars0)), float)
        A[0, 0:-1:2] = 1
        b = np.zeros((len(pars0, )))
        b[0] = pop
        print(A, b)
        constraint = LinearConstraint(A, lb=b, ub=b, keep_feasible=True)
        bounds = sum([fn.bounds for fn in pro_model.fns], [])

        optresult = minimize(
            cost_grad,
            pars0,
            method="trust-constr",
            jac=True,
            hess=SR1(),
            bounds=bounds,
            constraints=constraint,
            callback=callback,
            options={"gtol": gtol, "maxiter": maxiter},
        )

output:

  126      6    2.0664675    2.0664675  -0.0000e+00   6.1163e-01    0.0484080
-----  -----  -----------  -----------  -----------  -----------  -----------
Optimizer message: "`xtol` termination condition is satisfied."
Total charge:             -2.3269698e-06
Sum atomic charges:       -2.3269698e-06
Promodel
 ifn iatom  atn       parameters...
   0     0    8       1.32925343     18.58879411
   1     0    8       4.66929957      3.48991506
   2     1    1       2.86085588      1.14814615
   3     2    1       1.14059344      1.89308011
Computing additional properties
Sum of charges:  -2.326969790633626e-06
AIM charges: [ 2.001447   -1.86085588 -0.14059344]

Without the constrain, the output is:

   50     44    0.2242531    0.2242531   4.4487e-08   6.7676e-09    0.0521780
-----  -----  -----------  -----------  -----------  -----------  -----------
Optimizer message: "`gtol` termination condition is satisfied."
Total charge:             -2.3269698e-06
Sum atomic charges:       -2.3714572e-06
Promodel
 ifn iatom  atn       parameters...
   0     0    8       1.48539364     18.89318746
   1     0    8       7.05783539      2.68722754
   2     1    1       0.72838059      2.76389621
   3     2    1       0.72839276      2.76387596
Computing additional properties
Sum of charges:  -2.371457152872125e-06
AIM charges: [-0.54322902  0.27161941  0.27160724]
PaulWAyers commented 1 year ago

No idea what is going wrong, but the results (this is water, I think) are crazy. It's even more crazy since the Lagrange multiplier for this constraint was automatically (implicitly, by using extended KL divergence) incorporated in the objective function, as I recall.

@FarnazH has a constrained population analysis code that is working well, as I recall. But it doesn't have MBIS in it I think.

yingxingcheng commented 1 year ago

Yes, the current implementation uses an explicit multiplier (=1). However, the total charge of water doesn't equal the sum of charges from the calculations. Although adding explicit constraints seems to satisfy the condition, the results become strange. I have implemented some new AIM methods in the horton-part package based on Horton using constraint optimization (not for MBIS as well). I'm doubting if the current method is robust for all molecules, i.e., one-step optimization. Moreover, it could be computationally expensive for large molecules.

PaulWAyers commented 1 year ago

You mean that the charges don't add up exactly to the total charge? The error of ~1e-6 is just going to be fixed if the convergence tolerance is tighter, which probably requires a tighter grid.

I think Farnaz's implementation also uses scipy.opt, but there are certainly better strategies (in her thesis, but require more implementation).

yingxingcheng commented 1 year ago

You mean that the charges don't add up exactly to the total charge?

yes

PaulWAyers commented 1 year ago

Whats the issue with an error of ~1e-8 relative to the (numerical) intergration error that is already ~1e-6? The error in the constraint is already >10 times less than the error in the integration, so I'm confused as to why it's problematic.

yingxingcheng commented 1 year ago

Sorry for the confusion. The negligible deviation between the total charges and the sum of atomic charges is not problematic. The problem arises when explicit constraints are added, as this deviation becomes zero regardless of the grid used, but atomic charges are not correct anymore. This issue could be attributed to two factors:

If we assume that point 1 is okay, point 2 needs further testing.

PaulWAyers commented 1 year ago

I see. It may be worth looking at a more realistic example, where the explicit constraint is truly additional. The classic one we'd worked with was constraining protein backbone charges to match MM force fields, for example. But you could also constrain the charges of residues in a protein to all be the appropriate integers.

The objective function in MBIS has local minima (even without constraints), but the initial guess is usually very good. The multiple local minima problem is inevitable I think, but one can hope to find a good minimum that is (as close as possible) to the unconstrained calculation with a suitable initial guess. Near-flat surfaces are, I suspect, less problematic: that just means there are several "equally good" charges by the criterion one is using, and (if one cares) one could select one via some penalty. I feel like (without much evidence) this "equally good" case doesn't show up often, as we haven't observed cases where the charges change drastically with small changes of configuration, and one would expect to see the charges shift quite a bit in response to changes in the objective function if the objective function were very flat near the minimum. Of course, the Hessian of the objective function can tell you explicitly whether the "flat minimum" problem is there.

tovrstra commented 1 year ago

I believe the error is entirely due to the integration grid in this case. Is this the h2o_sto3g example?

yingxingcheng commented 1 year ago

Yes, I have checked a new grid with -a 1000 -r 1000, and the error is gone, but the charges are still not resonable. The following is the input and output.

cat runall.sh

denspart-from-horton3 h2o_sto3g.fchk density.npz -r 1000 -a 1000
denspart density.npz results.npz
denspart-write-extxyz results.npz results.xyz

The log file:

Optimizer message: "`xtol` termination condition is satisfied."
Total charge:              2.1788935e-09
Sum atomic charges:        2.1788938e-09
Promodel
 ifn iatom  atn       parameters...
   0     0    8       0.68039661     15.88102364
   1     0    8       6.40835281      1.76085726
   2     1    1       2.02002220      2.25293080
   3     2    1       0.89122838      0.92313610
Computing additional properties
Sum of charges:  2.17889384312997e-09
cat result.xyz

3
Properties=species:S:1:pos:R:3:charges:R:1:rcubed:R:1:valence_charges:R:1:core_charges:R:1:valence_widths:R:1
O    -2.3534315116    1.7976043966    0.0000000000    0.9112505777    3.2136430957   -6.4083528080    7.3196033857    0.3005224922
H    -1.3674018242    1.8792998063    0.0000000000   -1.0200221978    0.5978496936   -2.0200221978    1.0000000000    0.2348839174
H    -2.6055654975    2.7543473799    0.0000000000    0.1087716223    0.5206776455   -0.8912283777    1.0000000000    0.5732385607
tovrstra commented 1 year ago

OK. Thanks for clarifying.

Just to be clear: this the result you get when trying to impose the constraint explicitly, as in your code above?

yingxingcheng commented 1 year ago

this the result you get when trying to impose the constraint explicitly, as in your code above?

Yes

tovrstra commented 1 year ago

It seems SciPy's trust-constr should not be given additional constraints of the form you've specified. I've personally never worried too much about such a feature, so I cannot comment on it at this stage. This specific issue may be fixable one way or the other, but I cannot make it a priority now.

In general, constraining the quadrature error to zero is not guaranteed to remove the quadrature error of other non-zero quantities. It may also increase errors on other quantities, hard to tell in advance, so I would not recommend that approach, even if it works for the total charge.

I'm going to close this issue, as I don't see merit in fixing it. That said, constraints may be useful in other scenarios, but I'm not working on these.