openforcefield / nistdataselection

Records the tools and decisions used to select NIST data for curation.
MIT License

[Sage Discussion]: Pure Compound Properties #13

Open ocmadin opened 4 years ago

ocmadin commented 4 years ago

Pure Compound Properties: Currently we are using density, hvap, and static dielectric constant for fitting. For the Sage release, are we going to stick with these properties, or do we want to add or remove properties (e.g. removing hvap, adding surface tension)?

ocmadin commented 4 years ago

Relevant notes from August discussion

mkgilson commented 4 years ago

I like the idea of dropping hvap in favor of surface tension, but I have a couple of questions:

  1. Are there enough surface tension data to make up for the lost hvap data?
  2. There is some risk that replacing hvap with surface tension will lead to unforeseen problems or quirks with the Sage release. Could we do some kind of trial run to suss this out before committing to such a potentially big change in a major release?

ocmadin commented 4 years ago

  1. Are there enough surface tension data to make up for the lost hvap data?

My recollection is that pure hvap data was one of the most scarce forms of data in ThermoML, and there was a fair bit (~5x) more surface tension data. Most of the DIPPR data we are asking for will also have surface tension measurements (as well as hvap).

  2. There is some risk that replacing hvap with surface tension will lead to unforeseen problems or quirks with the Sage release. Could we do some kind of trial run to suss this out before committing to such a potentially big change in a major release?

Definitely agree. If we are going to drop hvap data, we'll need to do testing to make sure that it doesn't cause big problems. It would be useful to identify which experiments we would need to run for a trial. Alternatively, we could just add surface tension for Sage and think about removing hvap at a later date.

ocmadin commented 4 years ago

Relevant paper on using surface tension in fitting from @yudongqiu @leeping: https://pubs.acs.org/doi/full/10.1021/acs.jpcb.9b05455 could be a useful starting point.

davidlmobley commented 4 years ago

I am also in favor of either dropping DeltaHvap (to replace it with other things) or, if we can afford the time, doing a comparison study to see what happens with and without it.

Surface tension data appeals to me; @leeping seems to have had good success with it. But I do like the idea of a trial run.

What about heat of mixing?

Regarding dielectric constants -- we did not fit to these for Parsley and I believe they are not on the short-term plan @simonboothroyd is working from; to me, it does not make sense to include these in the fit until we are also fitting BCCs. If we are fitting BCCs for Sage then I am OK with including dielectric constants, but not otherwise. For some groups, like alcohols, the charges NEED to change to get dielectric constants right.

Are there other good data types (aside from surface tension) that we can easily calculate precisely that will do a good job of constraining LJ?

davidlmobley commented 4 years ago

@jchodera also spoke up in favor of dropping Hvap on Slack the other day; not sure if he'll chime in here.

mrshirts commented 4 years ago

We need energetic data of the form dG/dT to define the free energy function G(P,T). (Although dG/dT = -S, d(beta G)/dbeta = H, so we are getting roughly equivalent data.) If we don't have that right, enthalpy of mixing has a reference point issue: we could make everything too cohesive or not cohesive enough even if we get the mixing enthalpy right. I think replacing dH_vap is an important point, but I think there's a moderate amount of science needed to find something that properly replaces it. We are not entirely sure if surface tension is the right replacement, and it would be good to do the proper experiment.
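
For reference, the two derivatives mentioned above are linked by the standard Gibbs-Helmholtz relation. The short derivation below is a sketch using only textbook definitions; the symbols $k_B$ and $\beta = 1/(k_B T)$ are the usual conventions and are assumed here rather than taken from the thread.

```latex
\begin{align}
G &= H - TS, \qquad \left(\frac{\partial G}{\partial T}\right)_P = -S, \qquad \beta = \frac{1}{k_B T} \\
\left(\frac{\partial (\beta G)}{\partial \beta}\right)_P
  &= G + \beta \left(\frac{\partial G}{\partial T}\right)_P \frac{dT}{d\beta}
   = G + \beta \,(-S)\,(-k_B T^2) \\
  &= G + TS = H
\end{align}
```

So temperature-derivative data and enthalpy data constrain the same free energy surface, which is why an enthalpy-type observable is a natural way to pin down overall cohesion.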

I think we could possibly get away with just dHvap of less polar molecules, which should have fewer issues (though there are a couple of other quantum effects that differ between liquid and vapor). Then we have essentially a general reference point for overall cohesion. We do have to be a little bit careful how dH_vap is calculated to make sure it is consistent with experiment. For example, are there contributions from nonideality of the vapor, which are more relevant for lower boiling point compounds?
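
To make the consistency concern concrete, here is a minimal Python sketch of the commonly used ideal-vapor estimate of the enthalpy of vaporization from simulation averages. It is an illustration under stated assumptions, not necessarily the procedure used in this project: the function name, argument names, and units are hypothetical, the vapor is treated as an ideal gas, and the liquid pressure-volume term is neglected unless supplied.

```python
# Molar gas constant in kJ / (mol K).
R_KJ_PER_MOL_K = 8.314462618e-3


def enthalpy_of_vaporization(u_gas, u_liquid, temperature, liquid_pv=0.0):
    """Estimate dH_vap in kJ/mol (hypothetical helper, ideal-vapor assumption).

    u_gas       -- mean potential energy of one gas-phase molecule, kJ/mol
    u_liquid    -- mean liquid potential energy per molecule, kJ/mol
    temperature -- temperature in K
    liquid_pv   -- optional P * V_liquid term in kJ/mol (usually negligible)

    Assumes an ideal vapor, so P * V_gas = R * T. Any vapor nonideality
    would have to be added as a separate correction.
    """
    return (u_gas - u_liquid) + R_KJ_PER_MOL_K * temperature - liquid_pv


# Example with made-up energies at 298.15 K.
estimate = enthalpy_of_vaporization(u_gas=2.0, u_liquid=-40.0, temperature=298.15)
```

The `R * T` term is exactly where the ideal-gas assumption enters, so a nonideality correction would change this piece of the estimate.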

I think it might take a bit of time to work out a standard procedure for surface tension (though I am very much up for benchmarking surface tension), and I don't know that we have time for Sage.

Since this is the first optimization of the Lennard-Jones, I think that starting simple has some advantages.