MBAR vs PCFR - Githubissues

ramess101 commented 7 years ago

In Issue #4 I began discussing the PCFR approach in comparison with MBAR. I think this deserves its own issue. Comparing PCFR with MBAR is the first manuscript that I am working on. These two methods have some strong underlying similarities, namely, trajectories from a reference system can help predict properties for a target system. PCFR is limited to predicting U and P but, fortunately, with ITIC that is all we need. Furthermore, PCFR is really only applicable when changing the nonbonded potentials. So the comparison of MBAR and PCFR is strictly focusing on the prediction of U and P for different nonbonded potentials.

The results section of the manuscript provides a comparison between MBAR and PCFR for two different tasks. The first is for the local domain of the reference system. MBAR appears to be superior in this regard because there is sufficient phase overlap (Neff >> 1) while PCFR has the wrong trends. By contrast, when extrapolating for away from the reference system PCFR appears to be better. This is because PCFR actually changes the trajectories whereas MBAR reweights the trajectories. But if none of the trajectories are viable for the target system there is no possible reweighting that will give proper U and P.

Here I want to demonstrate this second point. I have included parity plots for the predicted U, P (and Z) for MBAR and PCFR relative to the direct simulations. The reference system is TraPPE and the target is Potoff.

Notice that PCFR has better energy agreement. The pressures are a little bit better in PCFR as well.

The improvement for PCFR is best appreciated by plotting the VLE from MBAR-ITIC and PCFR-ITIC compared to direct simulations (again, the target is Potoff and the reference for PCFR and MBAR is TraPPE):

Notice that rhol looks much more similar for PCFR and that Psat is also significantly better. This is convincing evidence that PCFR can be useful when extrapolating to different parameter sets.

Obviously this is a single case, but I am currently processing large amounts of data and developing parity plots for a wide range of parameter sets.

mrshirts commented 7 years ago

Thanks for all the great data! Could you be a bit more specific about what you mean by "The first is for the local domain of the reference system. MBAR appears to be superior in this regard". I want to have a good sense for the domain of applicability.

I'd love to introduce mapping during the paper as how one modify the configurations the "right" way, with PCFR presented as a way to approximate it with some nice advantages. But that's a writing thing. I'm not entirely sure if the configuration mapping can be used easily here, since you are doing isochoric simulations, and scaling the sigma requires changing the volume - you certainly can set up a correspondence between systems with different sigmas and different volumes, since you are doing a volume scan, but the bookkeeping might be a pain.

ramess101 commented 7 years ago

Could you be a bit more specific about what you mean by "The first is for the local domain of the reference system. MBAR appears to be superior in this regard". I want to have a good sense for the domain of applicability.

I am still working on this. Specifically, I am running direct simulations over a wide parameter space and determining the deviations for MBAR and PCFR. When those simulations finish I will post the results.

As for now, I have plots of the Neff from MBAR in Issue #4. I am repeating those figures here.

These are for a single reference (TraPPE), and keeping lambda at 12:

Neff is > 100 for a wide range of epsilon and a range of about 0.005 nm for sigma.

However, for lambda of 16 the Neff < 10 for the region of interest (higher epsilon and sigma):

Again, this just gives you an idea as to why certain regions may be inaccurate. But until I have all the direct simulation results to compare with I cannot say anything quantitatively.

ramess101 commented 7 years ago

Here are the results from the comparison between MBAR and PCFR for the LJ 12-6 parameter space. Recall (see previous comment) that the parameter space that we explored is from 88-108 K and 0.365-0.385 nm. There is a single reference system, namely, the TraPPE model at 98 K and 0.375 nm (right in the middle). The results are presented as parity plots and residual plots for U, P and Z.

First, here are the results for the entire parameter space:

These figures were a bit discouraging. But recall that we are covering a fairly wide region of parameter space. A lot of these parameter sets would yield very poor VLE curves. For example, what if we constrained our results to just those with constant sigma (0.375):

Clearly, for constant sigma both methods are much better. PCFR still has some systematic trends but MBAR appears to perform quite well. So I decided to focus on the range of applicability of MBAR. This is the performance for sigma from 0.373-0.378:

MBAR appears to perform pretty well over this range. To help quantify when we should trust MBAR I used different cutoffs for Neff.

First, let's plot the deviation in each property as a function of Neff:

Now, let's look at the parity plots and residual plots with a color scale corresponding to log10(Neff). First, with all the data (Neff > 0):

Clearly, all of the poor predictions correspond to log10(Neff) < 1, i.e. less than 10 effective samples. The reciprocal statement is not true. In other words, there are some good predictions that have Neff < 10 but the vast majority have very poor predictions.

Here I have changed the cutoff for Neff to be greater than 10 to demonstrate this point:

These images suggest that Neff > ~100 could be a better cutoff. It all depends on the level of accuracy that is required. I use Neff > 50 as the cutoff for these plots:

These figures suggest that Neff > ~50 is probably good enough. In other words, even the lowest Neff points on these plots have good parity plots and deviations.

Recall this image from my previous comment:

Notice that the lowest contour is an Neff_avg of 150. This suggests that we should be able to get very accurate predictions for that entire region of parameter space. That being said, the deviation in P and Z is only one order of magnitude less than P and Z (i.e. ~10% uncertainty). This is pretty substantial. For initial parameterization it is probably not a concern but for uncertainty quantification we will want this to be smaller.

The cutoff of Neff > ~50 is problematic for converting LJ to Mie 16-6. Recall this figure for lam = 16:

With Neff_avg < 10 for every point in this parameter space we can understand why MBAR would struggle to extrapolate to lam = 16.

mrshirts commented 7 years ago

These are beautiful results, Rich. Your use of color and N_eff really illustrates the point well. You can put all of these results in supporting information, with illustrative ones in the paper.

One question that wasn't so clear to me is how many states you are drawing on. Is this the scenario when you just are using one reference simulation? So the multisimulation case will be presented separately (After resolving, as you do so well here, the single simulation situation).

ramess101 commented 7 years ago

These are beautiful results, Rich. Your use of color and N_eff really illustrates the point well. You can put all of these results in supporting information, with illustrative ones in the paper

I agree that these figures can provide a lot of insight. I was thinking that there are four types of figures I could include (probably in the paper itself):

Parity plots to compare MBAR and PCFR over entire parameter space but with 0.374-0.376 nm as a subset
Parity plots for MBAR with color map of Neff
Residual plots for MBAR with color map of Neff
Residual plots for MBAR with respect to Neff

We would want one figure of each for both U and Z. The question for figures 2 and 3 is whether or not to include the entire range of Neff or just Neff > 10 or Neff > 50. I think figure 4 would justify why we use a cutoff for Neff for plots 2 and 3. I could include any additional plots (such as residual plots for PCFR) in the supporting information. I think I could also make contour plots showing the average deviation over the entire parameter space to demonstrate the region of applicability.

One question that wasn't so clear to me is how many states you are drawing on. Is this the scenario when you just are using one reference simulation? So the multisimulation case will be presented separately (After resolving, as you do so well here, the single simulation situation).

Yes, this is the case where we use a single reference at the TraPPE parameters (98 K and 0.375 nm), which is strategically right in the middle of the parameter space since we go from 88-108 K and 0.365-0.385 nm. I was thinking that I could use Neff as a criterion for when we need additional references. For example, since Neff < 50 for sigma < 0.372 nm and for sigma > 0.38 nm, we could add two reference systems at 0.372 and 0.38 nm. With those three references I expect MBAR to perform very well for lambda = 12 over a wider range of sigmas. It is possible that multiple sigmas would help for lambda = 16 as well. But yes, the multisimulation case will be presented after showing the single simulation situation.

ramess101 commented 7 years ago

Here are some contour plots that help visualize the area of predictability:

Clearly PCFR struggles for P and Z with respect to sigma. For U it is a bit different since PCFR can actually predict U accurately for a wide range of sigma whereas MBAR needs to be within the 0.372-0.378 sigma range (i.e. Neff > ~50).

jrelliottoh commented 7 years ago

Just an editorial note: I think you should use the same values for the contour magnitudes when comparing MBAR to PCFR.

ramess101 commented 7 years ago

Just an editorial note: I think you should use the same values for the contour magnitudes when comparing MBAR to PCFR.

I certainly agree. Thanks for the input.

I have updated all the figures to have consistent contour lines.

ramess101 commented 7 years ago

Another improvement to the figures is to sort the MBAR values by the Neff so that these "come to the front" of the plot. This helps demonstrate that there are no "high Neff" points that are hiding behind the other "low Neff" points.

For example:

ramess101 commented 7 years ago

I have also investigated the "Constant PCF" approach. Constant PCF means that we do not rescale the PCF at all. This is identical to using MBAR but with the weights of each configuration set to 1. Recall that one advantage of this approach is speed-up compared to MBAR (only really important when basis functions are not possible). It also represents a commonality between MBAR and PCFR. Originally I assumed that PCFR was superior to constant PCF, however, the figures below suggest that constant PCF might be better for P and Z.

The primary result is that a constant PCF shows very similar trends to MBAR. It's range of applicability is not as good as MBAR, i.e. the RMS increases more rapidly with respect to sigma. However, this approach has better trends for P and Z than the PCFR approach (where we scale the PCF). By contrast, the constant PCF is worse than PCFR for U. This suggests that a hybrid approach that utilizes PCFR for U and constant PCF for P and Z could be the best PCF method. To truly test the usefulness of PCFR we will need to investigate how well this works for extrapolating from lam = 12 to lam =16, since we know that MBAR struggles in this regard.

ramess101 commented 7 years ago

The results are in for the Mie 16-6 scan of the parameter space. The figures I am about to show are when you try to use the TraPPE 12-6 model to predict the Mie 16-6 parameter space of 108-128 K for epsilon and 0.365-0.385 nm for sigma. Recall that the average Neff is very low for MBAR in this case. As expected, MBAR struggles in this task. Note that on some figures I have included a "recommended" set which corresponds to using MBAR when you have Neff > 30 otherwise using an average of MBAR and PCFR.

Note that the only state that has a decent number of Neff is the lowest density on the supercritical isotherm. So, basically, changing the repulsive term causes very poor overlap in MBAR for anything other than a gas. To visualize this I have plotted the states sampled with points proportional to the percentage of systems that had Neff > 30 and Neff < 2:

ramess101 commented 7 years ago

I have now repeated the MBAR analysis using 9 reference sigma values (constant epsilon = 98 K, lambda = 12) between 0.365-0.385 (every 2.5 pm). With this spacing we see very good predictability over the entire parameter space.

Here are the parity plots and residuals for Udep and Z:

Notice that there are still a couple points that have Neff < 30 (some < 10). These points show the worst agreement. The figure below shows that these points with very few samples correspond to the high density saturated liquid states, but less than 1% of the 441 parameter sets suffer from this:

Most importantly, we now get the correct contours for RMS of rhol.

Here is direct simulation:

Here is MBAR single reference (at middle):

And here is MBAR with 9 references:

Notice that the 9 reference approach has very similar contours to the direct simulations. This is best observed by plotting the contours on top of each other:

Vapor pressure looks very good as well. RMS is not the best descriptor though since Pv has a very wide range in magnitudes.

ramess101 / MBAR_ITIC

MBAR vs PCFR #7