GridOptimization - Githubissues

dgasmith commented 5 years ago

Currently we have the TorsionDrive service which causes geometry optimizations in a wavefront propagation, as this can spawn multiple geometry optimization without dependance this makes for a good QCFractal service. Currently a single optimization looks something like:

{
 'keywords': # geometry optimizer keywords
 'input_specification' # A QCSchema input specification minus a molecule
 'initial_molecule' # The initial molecule

# Above is input, below is the output bit
 'trajectory' # A ordered list of single results
 'energies' # A ordered list of energies
 'final_molecule' # The final molecule specification (sometime not in the trajectory itself)
}

If we have a GridOptimization we will likely want to reuse the single optimization framework. Would we want something similar like:

{
 'keywords': # geometry optimizer keywords
 'input_specification' # A QCSchema input specification minus a molecule
 'initial_molecule' # The initial molecule

 `optimizations` # A list (?) of optimizations performed
}

What other top level fields do we want?

@leeping @yudongqiu

leeping commented 5 years ago

Here is what I'm imagining: 1) One of the optimizations on the grid will be run first, starting from the user-provided structure, and all the others would depend on a previous one. There is a dependence graph structure to the calculations (see image). 2) The user should be able to provide an option called absolute or relative. In the case of absolute, the first optimization is constrained to the nearest grid point, and the constraint values are absolute. In the case of relative, the first optimization is fully relaxed, and the constraint values are relative to the first optimized structure. 3) The user should provide a list of lists of constraint values, starting with the leading dimension.

wechatimg2

dgasmith commented 5 years ago

It might be good to start with the 1D case and move to 2D.

In general I think this should likely support n-dimensions and arbitrary combinations of bond/angle/torsion scans?

Looking at the potential cost we may consider driving this as a service just so that we can checkpoint the jobs.

leeping commented 5 years ago

Sure, I agree starting from 1D is a good idea. Jessica Maat in the Mobley group is already starting to generate and use 2D data so it would be optimal to support that as well.

Lee-Ping

dgasmith commented 5 years ago

Apologies, I meant 1D when describing the spec.

What we are really after her is how should access this grid data? It seems that you have primary and trailing axes, but for example you likely do not want to access that did as something like optimization_grid[0][0]. This one isn't nearly as obvious to as a single optimization trajectory or a TorsionDrive as those have well defined labels.

Should each part of the resulting grid just be a single Optimization object, or do we want some metadata there as well? The best way to get at this layout is through use cases, so it might be worth thinking about how you would like to access the data.

For TorsionDrives we a function like:

>>> torsiondrive.list_final_energies()
{"(-180, )" : -10, "(180, )": 10} # key: energy

>>> torsiondrive.list_final_results()
{"(-180, )" : ResultsObject, "(180, )": ResultsObject} # key: full results object

where the ResultsObjects have everything from final energies to Wiberg Bond orders.

What would be analogous here?

leeping commented 5 years ago

I think each part of the resulting grid should be an Optimization object; currently drawing a blank on what other data might be needed at the individual grid points.

dgasmith commented 5 years ago

Right, so each will be an Optimization object (we will have shortcuts to energy etc). But what is the label for each index? For torsiondrive the torsions had to be integers so that worked out as keys, here we will have floats for bonds and perhaps angles (I believe) so we will need a key like (0, 0). How will the user understand at what the distance was frozen at for the bond? In this grid which was the previous computation etc?

Hopefully this is helping, we can schedule a call soon if not!

Pinging @vtlim as well.

leeping commented 5 years ago

This definitely helps. One possibility is we add a method called GridOptimization.get_constraint_values() and that will return a dictionary that maps keys like (0, 0) to values like (1.81, 150.0). What do you think?

dgasmith commented 5 years ago

Sure, we can do that. In that case we will also need some kind of description of the what each dimension means in that case. How would we specify this?

grid = [{"type": "bond", "range": [1.1, 1.2], "step": 1.e5}]

What other kind of metadata will we need here?

leeping commented 5 years ago

I think that having unevenly spaced points (within a dimension) could be useful in certain use cases other than ours (for example, if you were generating structures for interaction energy calculations). Could we provide the full list of values for each dimension in the output?

dgasmith commented 5 years ago

Sure we would need to differentiate between scan and perhaps another keyword that explicitly defines a list, perhaps enumerate?

We can add a category or something like this which will differentiate the two. Keywords need some overhaul, but a feasible concept.

leeping commented 5 years ago

I would be happy with having just enumerate or both. The advantage of enumerate is that you could see the numbers themselves, e.g. 0.0 0.1 0.2 0.3 0.4 0.5. With geomeTRIC's scan I found it's easy to generate grid points that are different from the original intent, e.g. a grid of 10 points from 0.0 to 1.0 would end up giving 0.0 0.1111 0.2222 0.3333 ... 0.8888 1.0.

dgasmith commented 5 years ago

In that case we might just have:

grid = [{"type": "bond", "indices": [0, 1], "steps": [1.0, 1.01, 1.02, 1.1]}, ...]

Where the result would be the Cartesian product between all elements in the grid. Is it possible you want to do sparse grids?

Looking at your diagram again, we may be able to parallelize this where say if I am doing a 2D scan I can do grid (0,0) then spawn (0, 1) and (1, 0). Is this true?

leeping commented 5 years ago

I have no plans to do sparse grids at the moment. I could imagine some applications, for example if we wanted to scan the conformations of a ring (that sort of gets into torsiondrive territory).

(The above is not truly a sparse grid, but an example where we might want to provide a list of grid points, as one would in a sparse grid.)

I think that in a 2D grid that's a Cartesian product between two 1D grids, the maximum number of jobs that could run in parallel is 2*len(leading_dimension).

dgasmith commented 5 years ago

Interesting ok, between the possible parallelization and the snapshotting we might try to implement this as a service.

I think this is mostly what I need to give it a first pass. @vtlim do you have a small test case (~5 heavy atoms) for 1 and 2D scans you might want?

leeping commented 5 years ago

scan-1D.xyz: Example of a 1D scan of the improper dihedral angle involving indices 3, 1, 0, 12. 20 points from -40 to +40 degrees. (I would personally have used 21 points :)

scan-2D.xyz: Leading dimension is the improper dihedral angle involving indices 3, 1, 0, 12; 8 points from 0 to -70 degrees. Trailing dimension is the angle involving atoms 0, 3, 1; 11 points from 100 to 150 degrees.

Archive.zip

leeping commented 5 years ago

I think this might be @jmaat 's project, not @vtlim 's but correct me if I'm wrong.

leeping commented 5 years ago

geomeTRIC's scan feature is only able to start from one corner of the grid, and it only supports "absolute" and not "relative" constraint values. Thus, the GridOptimization feature request should generate slightly different (and superior) results compared to the ones I uploaded above.

dgasmith commented 5 years ago

Ah, I apologize if I switched the projects! Do you have a specific molecule in mind?

leeping commented 5 years ago

No problem, the molecule from @jmaat is provided in the zipped file above. :)

dgasmith commented 5 years ago

Hmm, that didn't show up for me originally. I have it now.

Edit: Yea for 2D we are talking about thousands of gradient evaluations. Let's parallelize that.

vtlim commented 5 years ago

Yes, this is @jmaat’s project.

dgasmith commented 5 years ago

@jmaat @leeping Will the optimization method for the grid and the initial optimization ever differ or should we trust that these two be the same?

Allowing them to be different isn't hard to do, but it doesn't seem like it should be something that we should allow in general.

leeping commented 5 years ago

I don't think the optimization methods for grid and initial optimization will ever be different (save for different constraints; the grid will have more).

I do think it will be helpful to support "freeze" or "scan" constraints for both the initial optimization and the grid, what do you think? These constraints should be different from those that are being scanned on the grid.

jmaat commented 5 years ago

@dgasmith

This is a follow up from our meeting yesterday. For the improper project, in immediate we need (1) optimized geometries and (2) energies. We also eventually will need (3) Wiberg bond order as output. We also will need (4) all of the quantities @leeping needs for valence parameters, such as vibrational frequencies.

dgasmith commented 5 years ago

Great thanks!

Wiberg is no problem, but the vibrational frequencies will take a bit more time (though we could write a simple wrapper to generated these from Hessians).

dgasmith commented 5 years ago

@leeping When running over this grid we have shown two ways to do this:

1) Starting along the first dimension and then propagating along the second dimension as the first completes:

 O -> O -> O -> X
 | -> O -> X
 O -> X
 |

2) Starting along the first dimension and then propagating front he middle of the second (similar to the hand drawing):


 X <- O -> X
      |
      X

Effectively, do we need to allow for two directions from a single starting point, or just one? The first is pretty easy to code up since we just have lists to loop over:

grid = [{"type": "bond", "indices": [0, 1], "steps": [1.0, 1.01, 1.02, 1.1]},
        {"type": "dihedral", "indices": [0, 1, 2, 3], "steps": [110, 125, 130, 125]}]

Note the steps list is not necessarily monotonic or uniform (note this is both good and bad, do we need to check this?). Is this bidirectionally something we should consider?

loriab commented 5 years ago

though your use case probably has additional complications, here's some frequencies-from-Hessian code in python, in case it's handy: https://github.com/psi4/psi4/blob/master/psi4/driver/qcdb/vib.py#L307.

leeping commented 5 years ago

I think the grid points in each dimension should be sorted prior to starting the procedure. The results could be returned to the user in sorted order. If the user originally provided non-monotonic inputs, then they could recover the original order later on. At the moment I don't see the use case for non-monotonic values on the grid.

If a repeated value occurs in the steps field, an error should probably be thrown (as shown above, 125 is repeated in the dihedral angle).

Constrained optimizations converge fastest when you start close to satisfying the constraint, so I think it is best to start at whichever grid point is closest to the un-constrained optimized geometry. Most often this means the propagation should start from somewhere in the middle, though not necessarily the exact center. I also thought it'd be useful if the user could specify absolute vs. relative, where the grid points provided in relative are offsets from the un-constrained optimized value. For example:

grid = [{"type": "bond", "indices": [0, 1], "values": "relative", "steps": [-0.1, -0.05, 0, 0.05, 0.1]}]

leeping commented 5 years ago

Thank you @loriab , that is very helpful. :) The harmonic analysis we want should be rather simple and I don't foresee any complications with our use cases. If this code is part of Psi4, then I think we could simply run it on the Hessians downloaded from QCArchive and obtain the frequencies / vibrational modes that we need.

loriab commented 5 years ago

Ok, good! Yes, it's in psi4. Medium term, it'll move to the common driver repo, but it'll always be available in a pure-python psi4/qca repo. To keep the topic in one place, here's a test case. Ping me with questions if you ever want to use it.

dgasmith commented 5 years ago

@leeping Ok, that makes sense and brings together several other questions. Will progress in that direction.

dgasmith commented 5 years ago

From #141 and #152 I believe this is effectively finished. Thank you everyone for the comments and discussion on the structure of the service.

The current input structure looks like the following:

service = GridOptimizationInput(**{
    "gridoptimization_meta": {
        "starting_grid":
        "zero",
        "scans": [{
            "type": "distance",
            "indices": [1, 2],
            "steps": [1.0, 1.1]
        }, {
            "type": "dihedral",
            "indices": [0, 1, 2, 3],
            "steps": [0, 90]
        }]
    },
    "optimization_meta": {
        "program": "geometric",
        "coordsys": "tric",
    },
    "qc_meta": {
        "driver": "gradient",
        "method": "UFF",
        "basis": "",
        "options": None,
        "program": "rdkit",
    },
    "initial_molecule": mol_ret["hooh"],
})

ret = client.add_service(service)

I will be reaching out in about a week to see to begin integrating and trailing this code.

leeping commented 5 years ago

Hello Daniel,

Thank you so much for implementing this. It will be highly useful for our valence parameterizations.

Two questions: 1) What is the meaning of "starting_grid":"zero"? 2) Is the user able to specify whether to run an initial fully unconstrained optimization?

Lee-Ping

dgasmith commented 5 years ago

Right now we have two guesses:

zero starts at simply the initial zero grid point.
'relative' starts at the closest point to the input molecule.

I had forgotten about that case, I guess we should make add one more option that does this, perhaps optimized_relative?

leeping commented 5 years ago

Thanks for the explanation, there's a chance we may be thinking about this differently. My initial plan had two separate choices:

Choice LP-1: The user decides whether the service should run an un-constrained optimization as an initial step, or use the provided structure directly. Call this result structure "A".

Choice LP-2: Whether the grid points are "absolute" geometric parameters or "relative / displacements" with respect to structure A. For example, suppose grid points [-0.1 0.0 0.1] are provided for an interatomic distance. This should makes sense for relative but not absolute.

Choice DS-1: If I understand correctly, your zero starts at the first / leftmost grid point in each dimension and your relative starts at the closest point to the initial molecule, i.e. structure "A". I assumed that we would always be making the latter choice, because the performance of constrained geometry optimization is poor when starting from a structure far from constraint satisfaction. However, I also see the benefit of having this be a choice, so it's up to you.

dgasmith commented 5 years ago

I was thinking about this slightly differently and didn't quite realize that relative was with respect to optimized parameters. This makes sense and something we can work in pretty easily. I think we will make this a per-scan dimension option for lots of flexibility. Should we default to one option or make the user provide this explicitly every time?

leeping commented 5 years ago

Thanks, I think having this be a per-dimension option is a good idea.

I prefer for the user to need to make this choice every time. There are relative benefits and drawbacks to using absolute vs. relative, and it plays a role in defining the grid points.

dgasmith commented 5 years ago

Ok, this should be implemented in #161. This proved to be a good test and revealed several bugs in the ecosystem. I think we should be good to go now!

dgasmith commented 5 years ago

Closing this issue, please open new ones if GridOptimization needs additional tweaks.

MolSSI / QCFractal

GridOptimization #112