openforcefield / open-forcefield-tools

Tools for open forcefield development
MIT License

Updated PropertyCalculator API #14

Closed (jchodera closed 6 years ago)

jchodera commented 8 years ago

This is the beginning of the PropertyEstimator implementation.

jchodera commented 8 years ago

Just trying to get Travis working to make it easier to test as I go along.

davidlmobley commented 8 years ago

@jchodera - we (@mrshirts, @bmanubay, and I) expect to get some coding done on this in the next two days. Should we be picking up from here or do you have:

Thanks

jchodera commented 8 years ago

I still need to do a lot more work here, but have a clear calendar for the next several days. The most helpful things from your end are:

From https://github.com/open-forcefield-group/open-forcefield-tools/issues/8, I am targeting MassDensity as the first property that will be computable.

davidlmobley commented 8 years ago

@jchodera @mrshirts

I still need to do a lot more work here, but have a clear calendar for the next several days. The most helpful things from your end are:

  • What is the optimal way to compute experimental properties?

Did you have something specific you wanted to know? I think at this point we just need the basic framework from you for something very simple (mass density, or even gas phase single molecule simulations) and @bmanubay and @mrshirts can take care of implementing handling of other cases. It's the framework which is harder for us to handle because we don't have as big a view as you do.

  • What data should I use for test data (where calculations will not take much time)?

How about single molecule gas phase simulations of any of the AlkEthOH molecules with SMIRFF? Maybe methane and butane, or methanol...? Average energy and average bond and angle values would be a great start, though even just average energy would get us the framework we need to go on to the others (i.e. if I have the framework for average energy I can probably implement the bond and angle values with enough thought).

I think relative uncertainty is great when it is available but we also need to be able to override with an absolute value when needed. You could implement only relative for now and we could allow for override later by specifying an optional default argument.
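As a minimal sketch of the "relative by default, absolute override via an optional argument" idea, something like the following could work. The function name `resolve_uncertainty` and its parameters are hypothetical and are not part of the PropertyEstimator API:

```python
def resolve_uncertainty(value, relative=0.01, absolute=None):
    """Return the uncertainty to attach to a measured value.

    Hypothetical helper: a relative uncertainty is used by default,
    and an absolute uncertainty overrides it when the caller supplies one.
    """
    if absolute is not None:
        if absolute < 0:
            raise ValueError("absolute uncertainty must be non-negative")
        return absolute
    if relative < 0:
        raise ValueError("relative uncertainty must be non-negative")
    # Relative uncertainty scales with the magnitude of the value.
    return abs(value) * relative


# Default: 1% relative uncertainty on a density of 997.0 (arbitrary units).
default_uncertainty = resolve_uncertainty(997.0)

# Override: force an absolute uncertainty of 2.0 in the same units.
override_uncertainty = resolve_uncertainty(997.0, absolute=2.0)
```

Keeping the override as an optional keyword means existing callers that rely on the relative default would not need to change.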

I think we can implement a lot of these once we have the framework. If you did just average energy for now that would be a great start, but if you did average energy plus average bond length for bonds specified by a SMIRKS then I'd more easily be able to extend to handle angles and (ultimately, with Michael's insights relating to Fourier series representations) torsions.

From https://github.com/open-forcefield-group/open-forcefield-tools/issues/8, I am targeting MassDensity as the first property that will be computable.

That is correct; aside from the single molecule gas phase simulations, that's the first place we're going. Note, though, that it's much lower priority than the gas phase simulations, in that we won't use it until we've done the gas phase simulations.

davidlmobley commented 8 years ago

If you have any specific questions for us that need answering urgently, feel free also to text or call me as I'll be in the same place as @mrshirts today and tomorrow so we can sort things out quickly. He's back in Colorado on Wednesday.

davidlmobley commented 8 years ago

@jchodera

What is the minimal set of property computations I need for your reference equilibrium data coming from simulation?

After discussion with @mrshirts, we think we will also want the standard deviation for bonds and angles, since this gives the force constant rather than just the typical distance/angle.
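To illustrate why the standard deviation carries force-constant information: for a one-dimensional harmonic term U(r) = (k/2)(r - r0)^2 sampled at temperature T, the variance of r is kB*T/k, so k can be estimated as kB*T divided by the sample variance. The sketch below assumes that simple 1D harmonic picture (it ignores the Jacobian and anharmonicity of a real bond), uses synthetic Gaussian samples rather than simulation output, and the function name `harmonic_estimates` is hypothetical:

```python
import math
import random
import statistics

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def harmonic_estimates(samples, temperature):
    """Estimate (r0, k) from samples of a harmonic coordinate.

    Assumes U(r) = (k/2) * (r - r0)**2, so that var(r) = KB * T / k.
    """
    mean = statistics.fmean(samples)
    variance = statistics.variance(samples)
    return mean, KB * temperature / variance


# Synthetic "bond length" data; these parameter values are illustrative only.
rng = random.Random(0)
r0_true = 0.109        # nm
k_true = 250000.0      # kJ/(mol nm^2)
temperature = 300.0    # K
sigma = math.sqrt(KB * temperature / k_true)
samples = [rng.gauss(r0_true, sigma) for _ in range(20000)]

r0_est, k_est = harmonic_estimates(samples, temperature)
```

With real trajectory data the samples would be correlated, so the statistical uncertainty of both estimates would need to account for that.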

HOWEVER, as noted above, I think once the basic framework is in place for something very simple we can probably easily handle the rest.

davidlmobley commented 8 years ago

@jchodera - is this to the point where you have enough here that some of it can be handed off to the rest of us? For example, I could have a go at the Substance structure to add checking that the IUPAC/SMILES are valid, that at least one is provided, etc. There's a lot we've done in SolvationToolkit that transfers straight across, though that envisions setting up a specific system so it starts dealing with the number of molecules, which this won't.
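A rough sketch of the "at least one identifier is provided" check might look like the following. The class name `SubstanceSketch` and its fields are hypothetical stand-ins, not the actual Substance class in this PR, and checking that a SMILES string or IUPAC name is chemically valid is deliberately left out because it would need a cheminformatics toolkit:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SubstanceSketch:
    """Hypothetical stand-in for a substance identified by name or SMILES."""

    iupac: Optional[str] = None
    smiles: Optional[str] = None

    def __post_init__(self):
        has_iupac = bool(self.iupac and self.iupac.strip())
        has_smiles = bool(self.smiles and self.smiles.strip())
        if not (has_iupac or has_smiles):
            raise ValueError("provide at least one of iupac or smiles")
        # Validity of the identifiers themselves is NOT checked here.


ethanol = SubstanceSketch(iupac="ethanol", smiles="CCO")
methane = SubstanceSketch(smiles="C")
```

Constructing `SubstanceSketch()` with neither identifier raises `ValueError`, which is the behavior the check is meant to enforce.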

Also, any pieces of this that you think are far enough along that you've completely defined what's going into them and what's coming out, PLEASE hand off to us so we can code them up. There are definitely a couple of people on my end who can help with coding, and I assume the same is true for @mrshirts. You're the key guy in terms of defining how the pieces need to work together, but you don't need to be the one writing everything.

Should I migrate "TO DO" items from the README.md to GitHub issues? For example, you've got this there:

infinite_dilution = Mixture()
infinite_dilution.addComponent('phenol', mole_fraction=0.0) # infinite dilution
infinite_dilution.addComponent('water')

TODO:

Probably we should have a separate issue to decide.

davidlmobley commented 8 years ago

@jchodera - did you also want help fixing your tests?

jchodera commented 8 years ago

No, I'd like to finish the preliminary implementation first, but this also requires that we finalize the basic API that you want me to implement first (which you've been changing during the implementation phase). That necessarily prolongs this process.

There's really almost nothing to the internals of the basic working model; it's just a matter of getting the API right so the information flow will work out and it will be easily extensible.

I was traveling in Germany today, but I'll return to the issue discussing the recent API extensions for simulation properties to see if we have consensus yet on what needs to be implemented.

davidlmobley commented 8 years ago

Great to see this coming together, @jchodera !

davidlmobley commented 6 years ago

@Lnaden - this is the main one; John already did a bit of work here. See also the main README.md which details the API.

The API currently envisions only ThermoML datasets, IIRC, but as discussed in the workshop we would also want to accommodate datasets which can have other types of measurements, such as hydration free energies or other things we might have (e.g. host-guest binding free energies).

The things we want to get working initially are density and dielectric constant calculations, since how to do these well is already very clear so it's just an issue of the API.
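For reference, the density part is essentially an average of total mass over box volume across simulation frames. The sketch below is only an illustration of that arithmetic, not the PropertyEstimator implementation; the function name `mass_density` and its arguments are hypothetical, and the inputs are assumed to be total molar mass in g/mol and per-frame box volumes in nm^3:

```python
import statistics

AVOGADRO = 6.02214076e23  # 1/mol
NM3_TO_CM3 = 1e-21        # 1 nm^3 expressed in cm^3


def mass_density(total_mass_g_per_mol, box_volumes_nm3):
    """Return the frame-averaged mass density in g/cm^3.

    total_mass_g_per_mol: sum of the molar masses of all molecules in the box.
    box_volumes_nm3: one box volume per simulation frame.
    """
    mass_g = total_mass_g_per_mol / AVOGADRO
    per_frame = [mass_g / (v * NM3_TO_CM3) for v in box_volumes_nm3]
    return statistics.fmean(per_frame)


# Illustrative numbers only: 1000 water molecules in a box of about 30 nm^3.
rho = mass_density(1000 * 18.015, [29.9, 30.0, 30.1])
```

A dielectric constant estimate would follow the same pattern but average box dipole fluctuations instead of volumes.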

jchodera commented 6 years ago

I've updated the API docs here and plan to merge this now so we can carry on with the implementation.