Open ramess101 opened 7 years ago
Hi, Rich-
Sorry for taking so long to reply; this needed some concentrated time, which ended up not being possible at the workshop.
Just got a little bit of time, will continue on the flight back to Denver. One thing I noticed so far that might be useful: if the simulation is NVT, then the Boltzmann weights are exp(-beta U), not exp(-beta H). The enthalpy can be calculated as <U> + <P>V, but U + PV should not be in the Boltzmann weight, just U. Also, the P that is printed out by simulations is not a thermodynamic observable, but rather a quantity that AVERAGES to P over long times. The individual frames are not thermodynamically relevant (just mechanical observables).
Looking at the magnitudes of the contributions, this might resolve the issues, but I will keep looking.
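For concreteness, here is a minimal sketch of what I mean by the reduced potentials (placeholder numbers and array names, not the repository scripts):

```python
import numpy as np

kB, T = 0.0083144621, 135.0        # kJ/(mol K), K
beta = 1.0 / (kB * T)

# Placeholder energies (kJ/mol) standing in for U_00 and U_01 from the reruns.
rng = np.random.default_rng(0)
U_00 = rng.normal(-5522.0, 30.0, 1000)   # TraPPE configurations, TraPPE energies
U_01 = rng.normal(-5900.0, 30.0, 1000)   # TraPPE configurations, Potoff energies

# NVT: the reduced potential handed to MBAR is just beta*U.
u_kn = beta * np.vstack([U_00, U_01])

# beta*(U + P*V) would only be appropriate in NPT, and even there P is the
# set-point pressure, not the fluctuating frame-by-frame virial pressure
# printed by the simulation.
```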
Michael,
Not a problem to be slow in replying, I have plenty to keep me busy :-)
Using a Boltzmann weighting of exp(-beta U) does appear to have fixed it. I know that I tried using betaU instead of beta(U+PV) at one point but I did not see any improvement because I was only using "MBAR" instead of "MBAR_alt". Now, using MBAR_alt with beta*U I get very good prediction for Mess-UP (better than "rerun") and it is the best for Potoff (even with the poor overlap). So that seems to have resolved my issue. And that suggests that "MBAR_alt" is the correct way to calculate the expectation values. However, this does raise an issue with the idea that we had of using a weighting of 1 when the effective number of samples is too low. Since the Potoff MBAR_alt is better than "rerun", even though Potoff has N_eff of 1, it does not appear that it is beneficial to use a weighting of 1. So it looks like using the scaled distances (like my PCF-PSO approach uses) might still be necessary if using a weighting of 1.
I am trying to understand the importance of the distinction you made for P. Are you just saying that we should use <P> to get <P>V? If so, isn't the difference pretty small between <U + PV> and <U> + <P>V?
Hi, Rich-
Looks like there were two issues. One is the U + PV vs. U issue, discussed in the earlier email.
OK, that does appear to be one of the issues.
➢ 1. Is the computing of expectation values done properly on line 112?
The MBAR_alt is the correct result, and I’ve sketched out how to get it using computeExpectations in a Ethane_MBAR_two_states_MRS.py script that I’ve added as a PR to your repository.
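Roughly, the idea is the following (a minimal sketch with placeholder data, not the actual Ethane_MBAR_two_states_MRS.py; the exact return type of computeExpectations depends on the pymbar version):

```python
import numpy as np
from pymbar import MBAR

kB, T = 0.0083144621, 135.0                    # kJ/(mol K), K
beta = 1.0 / (kB * T)

# Placeholder rerun energies (kJ/mol) standing in for U_00, U_01, U_02;
# only state 0 (TraPPE) was actually sampled.
rng = np.random.default_rng(1)
U_00 = rng.normal(-5522.0, 30.0, 1000)
U_01 = U_00 + rng.normal(-350.0, 10.0, 1000)   # Potoff rerun
U_02 = U_00 + rng.normal(-30.0, 5.0, 1000)     # Mess-UP rerun

u_kn = beta * np.vstack([U_00, U_01, U_02])    # reduced potentials (NVT, so beta*U)
N_k = np.array([1000, 0, 0])
mbar = MBAR(u_kn, N_k)

# The observable must be evaluated in the target state as well: U_01 for state 1,
# U_02 for state 2 -- not U_00 for every state.
A_kn = np.vstack([U_00, U_01, U_02])
results = mbar.computeExpectations(A_kn, state_dependent=True)
# Depending on the pymbar version, results is an (EA_k, dEA_k) tuple or a dict;
# the state-1 entry is the MBAR estimate of <U> for Potoff (the "MBAR_alt" number).
```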
For the other questions, perhaps look over the code that I’ve checked in and see if things fit together better now? The results I got were:
effective sample numbers [ 1001. 1.00460819 985.36394906]
For state 0:
MBAR estimate for U = -5522.483 +/- 0.873
Direct average = -5522.483 +/- 0.873 (obviously the same)
Weights=1 estimate = -5522.483 +/- 0.873 (obviously the same)

For state 1:
MBAR estimate for U = -5876.616 +/- 0.002
Direct average = -6223.083 +/- 0.902
Weights=1 estimate = -5727.294 +/- 1.568

For state 2:
MBAR estimate for U = -5554.139 +/- 0.889
Direct average = -5555.726 +/- 0.857
Weights=1 estimate = -5550.654 +/- 0.878
And I think everything makes sense now.
For kicks I actually wrote the code to scale the weights to get a predetermined number of effective samples. For example, to get 50 effective samples, you need to be 0.087 of the way from weights = 1 to MBAR, and you get -5818.121 +/- 7.081, but the statistical uncertainty is now mostly correct. You need to solve a nonlinear equation, but it's trivial since it doesn't involve rerunning MBAR each time.
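Schematically, it is something like this (placeholder names; a simple linear blend between the two weight sets):

```python
import numpy as np
from scipy.optimize import brentq

def blended_weights(x, w_mbar):
    """Fraction x of the way from uniform weights (x=0) to the MBAR weights (x=1)."""
    w = (1.0 - x) / len(w_mbar) + x * w_mbar
    return w / w.sum()

def n_eff(w):
    """Effective sample number, 1 / sum(w_n^2), for normalized weights."""
    return 1.0 / np.sum(w ** 2)

def solve_fraction(w_mbar, target):
    """Solve the 1D nonlinear equation N_eff(x) = target without rerunning MBAR."""
    return brentq(lambda x: n_eff(blended_weights(x, w_mbar)) - target, 0.0, 1.0)

# Toy example: 1000 samples where MBAR puts essentially all weight on one frame.
w_mbar = np.full(1000, 1.0e-6)
w_mbar[750] = 1.0
w_mbar /= w_mbar.sum()
x = solve_fraction(w_mbar, 50.0)
print(x, n_eff(blended_weights(x, w_mbar)))   # ~50 effective samples
# The reweighted estimate is then np.dot(blended_weights(x, w_mbar), U_01).
```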
➢ Finally, based on the information you provided in your previous emails, I have attempted to use smaller system sizes (50 molecules rather than 400) and longer simulation runs (20 times longer). But the results I obtained were essentially the same. That is, Potoff still has around 1 effective sample and MBAR_alt is better than Rerun which is better than MBAR.
That seems odd. Can you check again with the revised estimators? It could be that Potoff is still way off – N_eff = 1.3 is a lot better than 1.000001, but still isn't good enough to do anything with. (N_eff = 1.00001 indicates no other states are anywhere near . . . it makes me wonder if N_eff - 1 might be the most easily understood measure of overlap, since the minimum is 1.) We could also calculate the phase space overlap more quantitatively using mbar.computeOverlap (which requires U_00, U_01, U_10, and U_11) – documented in the 2015 paper.
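For the diagnostics, something like this (assuming a pymbar 3-style API built on the mbar object from the script in the PR; exact method names and return types may differ between versions):

```python
# Per-state effective sample numbers (minimum 1, so N_eff - 1 is the overlap
# measure suggested above).
N_eff = mbar.computeEffectiveSampleNumber()

# Quantitative phase-space overlap; needs u_kn built from U_00, U_01, U_10, U_11
# when both end states have been sampled. In recent pymbar 3 versions this
# returns a dictionary with a pairwise 'matrix' entry.
overlap = mbar.computeOverlap()
print(N_eff, overlap['matrix'])
```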
OK, now looking at this.
Ø Using a Boltzmann weighting of exp(-beta U) does appear to have fixed it. I know that I tried using betaU instead of beta(U+PV) at one point but I did not see any improvement because I was only using "MBAR" instead of "MBAR_alt". Now, using MBAR_alt with beta*U I get very good prediction for Mess-UP (better than "rerun") and it is the best for Potoff (even with the poor overlap). So that seems to have resolved my issue. And that suggests that "MBAR_alt" is the correct way to calculate the expectation values.
Great, that’s what I found + I added more explanations about how to use MBAR more directly (and yes, MBAR_alt).
Ø And that suggests that "MBAR_alt" is the correct way to calculate the expectation values. However, this does raise an issue with the idea that we had of using a weighting of 1 when the effective number of samples is too low. Since the Potoff MBAR_alt is better than "rerun", even though Potoff has N_eff of 1, it does not appear that it is beneficial to use a weighting of 1. So it looks like using the scaled distances (like my PCF-PSO approach uses) might still be necessary if using a weighting of 1.
That may be the case. Note that MBAR_alt probably has very large statistical error (you can run bootstrap sampling on that – I will try that on the bus home and check the data back in), so there might be advantages to weight=1 because of lower statistical error. And it's not like either is that close, so "MBAR is better" may not be generalizable.
Just to make sure I'm not seeing things, can you check if PCF-PSO w/o scaling is the same as "MBAR with weights 1" (i.e. just averaging U_01)?
Note that MBAR is nice because it doesn't require assuming a pair functional form and can be used for volumes and arbitrary observables, but I can easily see a situation where PCF-PSO is essentially a much faster approximation to "MBAR with scaled weights + coordinate scaling", and there could be a family of such approximations.
Ø So it looks like using the scaled distances (like my PCF-PSO approach uses) might still be necessary if using a weighting of 1.
Right. The equivalent in the "no approximations" limit would be, before rerunning, to scale all the intramolecular coordinates and add a Jacobian term (I can document this more, probably Fri or over the weekend).
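Roughly, and assuming an isotropic scaling x -> s*x of all the coordinates with s = sigma_1/sigma_0, the weight of a mapped configuration picks up the Jacobian of the scaling, s^(3N):

w_n is proportional to exp(-beta*U_1(s*x_n)) * s^(3N) / exp(-beta*U_0(x_n))

so in reduced-potential terms one evaluates U_1 at the scaled coordinates and subtracts 3N*ln(s) before handing the differences to MBAR.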
Michael,
Thanks for all the valuable feedback. MBAR is working as it should and the bootstrap uncertainties were very useful for the Potoff model that has N_eff ~1.
I have included a few additional files for running the PCF-PSO approach. The "PCF_PSO_two_states" code is similar to my more general PCF_PSO code but I tried to make it more explicit for the simple case that we are looking at here. Specifically, you will see how we take the reference PCF and can manipulate that PCF or use it directly to predict U. I have numbered several different PCF-PSO methods depending on different assumptions. Here is a summary of the results this code generates:
For state 0:
MBAR estimate for U = -5522.483 +/- 0.873
Direct average = -5522.483 +/- 0.873
Weights=1 estimate = -5522.483 +/- 0.873
PCF-PSO, general = -5522.483
PCF-PSO, scaled = -5522.483
PCF-PSO, predicted zeroth order = -5522.483
PCF-PSO, scaled zeroth order = -5522.483
PCF-PSO, hybrid of scaled methods = -5522.483

For state 1:
MBAR estimate for U = -5876.616 +/- 0.002
Direct average = -6223.083 +/- 0.902
Weights=1 estimate = -5727.294 +/- 1.568
PCF-PSO, general = -5728.899
PCF-PSO, scaled = -5906.148
PCF-PSO, predicted zeroth order = -6495.207
PCF-PSO, scaled zeroth order = -6525.827
PCF-PSO, hybrid of scaled methods = -6215.988

For state 2:
MBAR estimate for U = -5554.139 +/- 0.889
Direct average = -5555.726 +/- 0.857
Weights=1 estimate = -5550.654 +/- 0.878
PCF-PSO, general = -5550.659
PCF-PSO, scaled = -5550.659
PCF-PSO, predicted zeroth order = -5563.797
PCF-PSO, scaled zeroth order = -5563.797
PCF-PSO, hybrid of scaled methods = -5557.228
Note that I do not have errors for the PCF-PSO approach because that requires bootstrapping from the configuration files to get different PCFs, which is more labor intensive than I think this exercise merits right now.
You will be pleased to see that the PCF-PSO "general" approach (without any scaling, just using the PCF directly) yields nearly identical results to the Weights=1 estimate. The "scaled" results (i.e. scaled by sigma) are not as good as MBAR for Mess-UP but appear to be better for Potoff (i.e. when N_eff is very low). This confirms what we discussed previously about when PCF-PSO could work better (i.e. when overlap is very poor).

"Predicted zeroth order" is where we actually use the ratio between the predicted zeroth order PCF (i.e. exp(-U/T)) to allow the PCF to change for a different potential model. This works best when extrapolating to very different models (i.e. different sigmas, lambdas, epsilons). "Scaled zeroth order" is the same as "predicted zeroth order" but where we take the exp(-U/T) ratio based on a constant r* (r/sigma). This works better when the two sigmas are very different.

Finally, "hybrid of scaled methods" is just the average of "scaled" and "scaled zeroth order", because I have observed that the two methods have opposite trends, i.e. when one overpredicts the other underpredicts. Although this is pretty empirical, it clearly results in the best prediction of all methods (including MBAR) for Potoff (state 1). It is also about as accurate as MBAR for more similar states (state 2).

Obviously, a big weakness of the PCF-PSO approach is that it can only take information from a single reference state whereas MBAR can use configurations from multiple reference states. Therefore, the possible accuracy of PCF-PSO reaches a certain limit while MBAR can become more and more accurate with more sampling of parameter space.
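For orientation, the basic PCF-to-U step looks roughly like this (a single-site Mie sketch with made-up names; the actual PCF_PSO_two_states code is more involved, e.g. the U_error correction):

```python
import numpy as np

def mie(r, eps, sig, lam):
    """Mie lam-6 potential; eps in K so that u(r)/T is dimensionless."""
    C = (lam / (lam - 6.0)) * (lam / 6.0) ** (6.0 / (lam - 6.0))
    return C * eps * ((sig / r) ** lam - (sig / r) ** 6)

def U_from_pcf(r, g_r, eps, sig, lam, rho, N):
    """Configurational energy from a tabulated PCF: U = 2*pi*N*rho * int g(r) u(r) r^2 dr."""
    integrand = g_r * mie(r, eps, sig, lam) * r ** 2
    return 2.0 * np.pi * N * rho * np.trapz(integrand, r)

# A running (cumulative) version of the same integral, i.e. integrating over r[:i]
# for increasing i, shows how much of U has accumulated by a given distance.
```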
I will continue to look into the other points you addressed and get back to you later.
Thanks again
Obviously, a big weakness of the PCF-PSO approach is that it can only take information from a single reference state whereas MBAR can use configurations from multiple reference states. Therefore, the possible accuracy of PCF-PSO reaches a certain limit while MBAR can become more and more accurate with more sampling of parameter space.
And hopefully, starting from MBAR would give a bit more of a clue how to make approximations later, since it's applicable to all systems (i.e., not just pair potentials, etc.).
"Predicted zeroth order" is where we actually use the ratio between the predicted zeroth order PCF (i.e. exp(-U/T)) to allow the PCF to change for a different potential model.
Could you explain "zeroth order" a bit more? I'm not sure I followed. I understand how exp(-U/T) is the zeroth order, but I'm not sure what "use the ratio . . . to allow the PCF to change" means.
Note that I do not have errors for the PCF-PSO approach because that requires bootstrapping from the configuration files to get different PCFs, which is more labor intensive than I think this exercise merits right now.
Agreed.
Weights=1 estimate = -5727.294 +/- 1.568 PCF-PSO, general = -5728.899
Yeah, differences here are probably all in the errors inherent in histogramming and aren't that important.
And hopefully, starting from MBAR would give a bit more of a clue how to make approximations later, since it's applicable to all systems (i.e., not just pair potentials, etc.).
Yes, MBAR definitely has some flexibility that PCF-PSO does not have.
Could you explain "zeroth order" a bit more? I'm not sure I followed. I understand how exp(-U/T) is the zeroth order, but I'm not sure what "use the ratio . . . to allow the PCF to change" means.
So the "zeroth order" PCF is just exp(-U/T). We use the ratio between exp(-U(model_0)/T) and exp(-U(model_1)/T) to give us an idea as to how much the PCF should actually change when going from model_0 to model_1. For example, a larger epsilon should increase the height of the first peak. Using the ratio exp(-deltaU/T) (where deltaU = U(model_0) - U(model_1)) we can increase the height of the first peak. In addition, if the repulsive barrier changes (i.e. lambda changes from 12 to 16 when going from TraPPE to Potoff) the zeroth scaling can account for the change in shape of the initial increase in the PCF. At long distances the ratio becomes 1 and has no effect. I will include plots in Ethane_MBAR_PCF_PSO_comparison to demonstrate what this looks like for Potoff.
Can you write this out in equations? For example, do you rescale the peaks of the empirical PCF by exactly the ratio of the zeroth order PCF? What if the ratio isn't the same everywhere?
Yeah, differences here are probably all in the errors inherent in histogramming and aren't that important.
Agreed. Although I do try to account for that in my code (U_error determines the difference between the ensemble and histogram averages). The error might actually be that I assume U_error (i.e. U_ens - U_PCF) is constant between states 0, 1, and 2. But at any rate, this is not a large concern.
Can you write this out in equations? For example, do you rescale the peaks of the empirical PCF by exactly the ratio of the zeroth order PCF? What if the ratio isn't the same everywhere?
So the rescaling actually occurs at every value of r, since the ratio is very dependent on the distance; i.e., the ratio is greatest (or smallest) around the peak and then goes to 1. Here is the equation for predicting a PCF for a new model:
PCF_predicted_zeroth_order(r) = PCF_ref(r) * exp(-deltaU(r)/T)
Where PCF_ref is the PCF obtained from state 0. deltaU is the difference in the intermolecular potentials of states 0 and 1 (or 2) at a given distance. Specifically, deltaU(r) = U_1(r) - U_0(r) = Mie(eps_1,sig_1,lam_1,r) - Mie(eps_0,sig_0,lam_0,r).
Note that I believe I had the order of subtraction for deltaU wrong in my previous comment. Also, note that RDF_hat_calc in my code does not use these exact equations for some computational reasons (i.e. dividing by zero, etc.). But it gives the same result.
Finally, you can also use a scaled distance which gives:
PCF_scaled_zeroth_order(r) = PCF_ref(r) * exp(-deltaU(r*)/T)
Sorry for the confusing nomenclature.
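In code form, the two predictions are roughly as follows (made-up names; RDF_hat_calc in the repository does this differently for numerical reasons, and the constant-r* variant below is just one way of writing it):

```python
import numpy as np

def mie(r, eps, sig, lam):
    """Mie lam-6 potential; eps in K so that deltaU/T is dimensionless."""
    C = (lam / (lam - 6.0)) * (lam / 6.0) ** (6.0 / (lam - 6.0))
    return C * eps * ((sig / r) ** lam - (sig / r) ** 6)

def pcf_predicted_zeroth_order(r, pcf_ref, T, ref, new):
    """PCF_ref(r) * exp(-deltaU(r)/T), deltaU(r) = U_new(r) - U_ref(r); ref/new = (eps, sig, lam)."""
    dU = mie(r, *new) - mie(r, *ref)
    return pcf_ref * np.exp(-dU / T)

def pcf_scaled_zeroth_order(r, pcf_ref, T, ref, new):
    """Same ratio, but with each potential evaluated at the same reduced distance r* = r/sigma."""
    rstar = r / new[1]
    dU = mie(rstar * new[1], *new) - mie(rstar * ref[1], *ref)
    return pcf_ref * np.exp(-dU / T)
```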
I will include plots in Ethane_MBAR_PCF_PSO_comparison to demonstrate what this looks like for Potoff.
OK, I have added a few lines of code to generate some plots so you can see how well it predicts the PCF for the Potoff model from the TraPPE model.
Here is what the images should look like:
Clearly, at close distances the "Predicted ..." and "Scaled predicted..." PCFs agree with Potoff more closely than TraPPE does. However, the peak is slightly overpredicted (this is why the hybrid averaging approach works so well). At long distances the "Predicted ..." PCF is identical to the TraPPE PCF (the "Scaled predicted..." is slightly shifted).
Finally, based on the information you provided in your previous emails, I have attempted to use smaller system sizes (50 molecules rather than 400) and longer simulation runs (20 times longer). But the results I obtained were essentially the same. That is, Potoff still has around 1 effective sample and MBAR_alt is better than Rerun which is better than MBAR.
That seems odd. Can you check again with the revised estimators?
Alright so now using the correct MBAR equations I get a pretty good prediction of U for state 1. I resimulated the TraPPE model using only 50 molecules (1/8th as many as before). The predicted internal energy for Potoff is now -6159 kJ/mol (on the same 400 molecule basis) which is in very close agreement with the actual value of -6223 kJ/mol. It is interesting that the N_eff value is still around 1.001.
I have attached a figure to show how, with the smaller system size, we get larger fluctuations, which allows the internal energy to overlap more with the direct-simulation internal energies of Model 1. Note that Model 1 simulated used 400 molecules whereas Model 0 and Model 1 Rerun used only 50 molecules. This explains the large fluctuations in the blue and orange curves but not the green curve.
As far as using longer simulation runs: for the 400 molecule system, even when you use a simulation that has 20 times as many samples (4 times as long, with 5 times the sampling frequency), you get the same prediction as with the shorter simulation. I think this is because the fluctuations are so small for the 400 molecule system that it is never going to have a state that is similar to the direct simulation. The figure below demonstrates this point (note that Model 1 simulated is still the shorter simulation run).
OK, that makes sense now. I assume you've normalized by the 1/8 factor, correct? (I would think so, since otherwise they would not be on the same scale). And yes -- to get better behavior, one might need to run MUCH MUCH longer if the overlap was poor.
So the rescaling actually occurs at every value of r, since the ratio is very dependent on the distance; i.e., the ratio is greatest (or smallest) around the peak and then goes to 1. Here is the equation for predicting a PCF for a new model: PCF_predicted_zeroth_order(r) = PCF_ref(r) * exp(-deltaU(r)/T)
Great, I'll think some more about what this means in terms of other ways of thinking about the statistical mechanics. In my experience playing around with zeroth order approximations (see the papers on optimal alchemical paths), they work REALLY well on the repulsive wall, OK near the well, and fail outside -- but the ratio idea is interesting.
OK, that makes sense now. I assume you've normalized by the 1/8 factor, correct? (I would think so, since otherwise they would not be on the same scale).
Yes the 50 molecule simulations are adjusted to be on the same scale as the 400 molecule simulation.
Great, I'll think some more about what this means in terms of other ways of thinking about the statistical mechanics. In my experience playing around with zeroth order approximations (see the papers on optimal alchemical paths), they work REALLY well on the repulsive wall, OK near the well, and fail outside -- but the ratio idea is interesting.
The reason why the zeroth order approximation is so useful is because the largest error in U (and also in P if calculated using a PCF) is from the short distances. In other words, if we were to use the actual Potoff PCF at close distances (0.3 - 0.45 nm) and the TraPPE PCF for everything else we would get a much more accurate prediction of U_Potoff than when we use just the TraPPE PCF to predict U_Potoff. This is due to two factors. First, short distances actually contribute significantly to the integral for U. Quantitatively, the running integral is about 80% of the total integral by around 0.45 nm. Second, the PCF differs the most at close distances when comparing two different models (compare the PCFs for "Actual Potoff" and "Reference (TraPPE)"). After 0.45 nm the PCFs are very similar, especially when compared on a reduced scale (r*=r/sigma). So when we use the zeroth order approach we dramatically improve the PCF prior to the first peak (see the second plot). Unfortunately, the peak height is still not perfect but it is typically closer. And, like I mentioned, since we are using the ratio of zeroth orders we get the same PCF as the reference PCF for long distances regardless of the model.
I'm thinking that scaling the coordinates and then reevaluating the energies of the configurations will actually get both the rescaling of the PCF and the zeroth order correction. Here's my thinking: the rescaling of coordinates puts everything at the proper distances they would have with the sigma changes. However, rescaling the PCF just places the improperly weighted (for the "new" state) density out at the correct distance, but it doesn't give it the right weight. Rescaling by the zeroth order approximation ratio exp(-beta U_1)/exp(-beta U_0) is the same as adding U_1(r) - U_0(r) to all the interaction pairs. If we rescale the coordinates and reevaluate the energy, then the new energies, which WILL get propagated into the weights, will have that additional U_1(r) - U_0(r) energy, which will get both added to the Boltzmann factors, and then will contribute to the
Conclusion: applying coordinate mapping by rescaling the COM of the molecules by the ratio of sigmas will put the particles near the minimum well distance of the new potential, and evaluating the energies with the new epsilons should give better weights, and properly contribute to the energies (i.e. it will be both of the U's in \int U(r) exp(-beta U(r)) dr, rather than just the U outside of the exponential).
Does this make sense?
That does make sense. I think you are correct that scaling COM while using MBAR will not only put everything at proper distances so that you have good configurations to sample from but MBAR will actually weigh them appropriately so that the estimate will be reliable. Effectively, MBAR could give us the correct PCF but we will just cut out that middle step. And I think looking into the similarities between using the ratio of zeroth order corrections and MBAR could be insightful. So I think it would be very interesting to test out your hypothesis. But how exactly can we rescale the COM rapidly with Gromacs? Is there a way to edit the trajectories in the .trr file that is used in the "rerun"?
Also, in your previous comment I think you forgot a word/phrase at the end of the first paragraph, "and then will contribute to the...".
Easiest thing to do would probably be to use a package like mdtraj to read in the files and adjust the coordinates.
I just did a pull request with a script that does something similar (remaps to get better overlap when changing T and P). You'd have to modify things a bit more to just adjust the COM of each molecule, and not all of the coordinates, but should be pretty straightforward.
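Something along these lines (an untested sketch; file names and the scaling factor are placeholders, and the masses assume mdtraj can assign elements -- for united-atom sites equal weights would do just as well):

```python
import mdtraj as md
import numpy as np

traj = md.load('traj.trr', top='conf.gro')
s = 0.995   # ratio of sigmas (or r_min's), new model / reference model

# Rescale each molecule's center of mass by s while keeping its internal geometry fixed.
for res in traj.topology.residues:
    idx = [a.index for a in res.atoms]
    masses = np.array([a.element.mass for a in res.atoms])
    com = np.einsum('fij,i->fj', traj.xyz[:, idx, :], masses) / masses.sum()  # (n_frames, 3)
    traj.xyz[:, idx, :] += ((s - 1.0) * com)[:, np.newaxis, :]

traj.save_trr('traj_rescaled.trr')   # then "gmx mdrun -rerun traj_rescaled.trr" with the new model
```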
So I guess I am actually confused as to what we mean by "rescaling the COM" in an NVT context. If this was NPT I would understand this as scaling the box volume and the COM x,y,z by the ratio of sigmas. But if the volume is constant how do we rescale the COM coordinates? What we really want is to rescale the distances between all the molecules. So are we referring to a Cartesian coordinate rescaling (x,y,z)? Or are we actually saying that we would find the COM distances between interacting molecules (r) and rescale those values? The latter makes a lot more sense to me (r, not x,y,z).
Ah, good point if doing NVT. This approach could only be done between different phase diagram points, between V for molecule 1 and V* for molecule 2, where V* = (sigma*/sigma)^3 V. You can't really come up with an easy mapping between the phase spaces of molecules of different size in the same volume, because their phase spaces are so different -- large molecules are going to be necessarily squeezed together, which is going to change the equation of state, so it's not so good a match anymore.
If sigma is changing a lot but R_min isn't (because of the change of the functional form), one could certainly rescale by R_min instead of sigma, of course.
If two potentials have different R_min's, then it seems unlikely they will have the same pressure at a given V, since the molecules are fundamentally bigger.
If one wants to map between different volume (as you mentioned before, a weakness of the PCF approach), then NPT simulations might be necessary.
A few more directions to go here, but let me know your thoughts.
Yeah, the NVT system is a problem in that sense. But since we will likely want to do some NPT simulations the COM scaling is still something to have in our tool belt. Although I am still trying to wrap my head around why we can scale the RDF so easily but scaling the trajectories doesn't work.
As for using rmin instead of sigma: I originally thought that using rmin would be better for scaling the RDF, since the first peak is located close to rmin, not sigma. But for some reason it actually gives slightly worse predictions. Here are the values for using rmin as the characteristic distance with PCF-PSO:
For state 1:
MBAR estimate for U = -5876.616 +/- 0.002
Direct average = -6223.083 +/- 0.902
Weights=1 estimate = -5727.294 +/- 1.568
PCF-PSO, general = -5728.899
PCF-PSO, scaled = -5906.148
PCF-PSO, predicted zeroth order = -6495.207
PCF-PSO, scaled, predicted zeroth order = -6525.827
PCF-PSO, hybrid of scaled methods = -6215.988
PCF-PSO, scaled by rmin = -5972.620
PCF-PSO, scaled by rmin, predicted zeroth order = -6622.662
PCF-PSO, hybrid of scaled by rmin methods = -6297.641

For state 2:
MBAR estimate for U = -5554.139 +/- 0.889
Direct average = -5555.726 +/- 0.857
Weights=1 estimate = -5550.654 +/- 0.878
PCF-PSO, general = -5550.659
PCF-PSO, scaled = -5550.659
PCF-PSO, predicted zeroth order = -5563.797
PCF-PSO, scaled, predicted zeroth order = -5563.797
PCF-PSO, hybrid of scaled methods = -5557.228
PCF-PSO, scaled by rmin = -5551.094
PCF-PSO, scaled by rmin, predicted zeroth order = -5568.836
PCF-PSO, hybrid of scaled by rmin methods = -5559.965
Notice that the rmin approach always comes out more negative than the sigma scaled approach. Like I said, I would have thought rmin would perform better.
So essentially the conclusion I am seeing is that if we want TraPPE and Potoff to overlap enough to observe similar energies we need to use a small system. Then we wouldn't even need to worry about rescaling with MBAR because the overlap is good enough.
This is what scaling by rmin looks like when predicting the RDF. Previously I had an error in the script that plotted the wrong figure for scaled by rmin; this shows that it is very similar to the others:
Yeah, the NVT system is a problem in that sense. But since we will likely want to do some NPT simulations the COM scaling is still something to have in our tool belt.
Yeah, really a question of whether one would want to do most of the simulations in NPT or NVT, which depends on the application.
Although I am still trying to wrap my head around why we can scale the RDF so easily but scaling the trajectories doesn't work.
Still thinking about this a bit, in terms of what scaling the RDF actually does. Not entirely sure yet. The fact that scaling the RDF ignores the volume is an issue with any potential that is compressible, but one might get away with it for relatively incompressible materials . . .
So I am still thinking about the whole scaling idea for MBAR. But in the meantime I was looking at which approach is best for the PCF-PSO method, and I discovered that in fact using rmin to scale is better than sigma. I had always assumed that would be true, but the results presented previously suggested otherwise. However, before we had only looked at 135 K. If we look at the deviations with respect to temperature, sigma is only better at temperatures less than or equal to 135 K; it quickly becomes worse at higher temperatures. To give you an idea, these are the percent deviations using constant sigma and constant rmin with respect to temperature:
Scaling by rmin makes more sense than scaling by sigma, in my opinion. So it is encouraging to see that at almost all temperatures rmin is better than sigma. It is also encouraging that the percent error is relatively low. For example, using rmin the AAD is 1.2%, meaning that you can predict the Potoff internal energies to about 1.2% accuracy just by simulating the TraPPE force field. If you consider the fact that TraPPE and Potoff deviate from each other by about 9.7%, the PCF-PSO approach with rmin can correct for about 88% of the deviation between the two models.
I have included a python script PCF_PSO_two_states_all_temps that will compare the different methods at multiple temperatures. This file requires TraPPE_all_PCFs to run.
Glad rmin makes sense, that makes everything fit together at an idea level better.
In terms of volume sampling - you mentioned an issue was that you couldn't predict density with the approach. However, you certainly can predict pressure (there's a PCF-based formula for pressure). If you can do pressure, then you can run at a number of volumes and extract the equation of state that way.
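For reference, the virial route looks roughly like this (a single-site sketch in consistent units, not code from the repository):

```python
import numpy as np

def pressure_from_pcf(r, g_r, du_dr, rho, kT):
    """P = rho*kT - (2*pi/3) * rho^2 * int g(r) (du/dr) r^3 dr (all in consistent units)."""
    return rho * kT - (2.0 * np.pi / 3.0) * rho ** 2 * np.trapz(g_r * du_dr * r ** 3, r)

# du_dr can come from an analytical derivative of the Mie potential, or from
# np.gradient(u_r, r) on a tabulated u(r).
```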
So I am still thinking about the whole scaling idea for MBAR.
I'm happy to explain more about how it might work. How USEFUL it is depends on a number of other things that would take some discussion.
I resimulated the TraPPE model using only 50 molecules (1/8th as many as before). The predicted internal energy for Potoff is now -6159 kJ/mol (on the same 400 molecule basis) which is in very close agreement with the actual value of -6223 kJ/mol. It is interesting that the N_eff value is still around 1.001.
I never plotted the pressure results before. So here I compare how MBAR predicts P with 400 and 50 molecules.
400 molecules:
MBAR P for Model 0 (TraPPE): -33.5989142846
Direct P for Model 0 (TraPPE): -33.5989142846
MBAR P for Potoff: 816.98173947
Direct P for Potoff: -14.8469519487
MBAR P for Mess-UP: -45.3747290777
Direct P for Mess-UP: -58.6039326808
Notice the large overprediction in P for Potoff (which has poor overlap) and the relatively poor prediction for Mess-UP (considering how similar it is to TraPPE).
This shows the pressure with respect to snapshots. First, with Model 1 as Potoff:

[image: image] https://user-images.githubusercontent.com/23408516/29174131-99be1778-7da2-11e7-87cf-c94a1868b8cc.png
Notice that the reruns are way higher P compared to where the Model 0 and Model 1 direct simulation are.
With Model 1 as Mess-UP we don't see this strong shift:

[image: image] https://user-images.githubusercontent.com/23408516/29174253-f08acb82-7da2-11e7-8a23-0807521e6a8b.png
Now 50 molecules:
MBAR P for Model 0 (TraPPE): -180.134497461
Direct P for Model 0 (TraPPE): -180.134497461
MBAR P for Potoff: 104.048218013
Direct P for Potoff: -14.8469519487
If I use 50 molecules I do see a significant improvement for Potoff (although it is still not quantitatively accurate). Here we see that there is not as strong of a shift to higher pressures but still it is slightly shifted (the primary benefit being that the larger fluctuations allow for overlap with the pressures of interest in the direct simulation of Potoff):

[image: image] https://user-images.githubusercontent.com/23408516/29174331-33eca724-7da3-11e7-9de5-1fdd58787761.png
You may have noticed that the pressure for TraPPE (Model 0, aka reference) is very different for 400 and 50 molecules. If we plot them together we see that they do fluctuate around a similar average, but the uncertainty with 50 molecules is much larger:
All in all, this suggests that we should be using smaller system sizes when trying to optimize our models. Maybe we could use 50 until it converges and then go to 400 so we don't have system size effects. Perhaps ITIC is not as subject to system size effects because we never actually simulate near the critical region. (Note that the reason why the green line above for Model 1 direct simulation has smaller fluctuations is because that is still for 400 molecules.)
Good data to inform choices.
Michael,
I am working on comparing PCF-PSO and MBAR some more. I have added a new folder titled "Ethane" to the "MBAR_LJ_two_states" repository on Github. I believe I already invited you to join that repository, but I have also attached the necessary files just in case.
The simulation results were obtained using gromacs by performing NVT simulations for ethane using different models. The energyi_j.txt files contain the simulation energies and pressures, where "i" refers to the model used to generate the configurations and "j" refers to the model used in the "rerun". For example, energy0_2.txt was simulated with model 0 (TraPPE) and rerun with model 2 (Mess-UP).
The purpose is to see how well MBAR predicts U for two different models (Potoff and Mess-UP) provided configurations obtained from a single simulation using a "reference" model (TraPPE). Potoff is a Mie potential and the overlap with the TraPPE model is very poor (only 1 sample from 1000 actually contributes). Mess-UP is a LJ potential that is very similar to the TraPPE potential and so it has much better overlap (from 1000 samples the number of effective samples is about 500).
I am trying to make sense of the results and I was hoping you could answer some of my questions.
Is the computing of expectation values done properly on line 112? Note that on line 112 I am providing A_kn, which is U_kn (see line 110) which is U_00 (see line 103). In other words, A_kn is just the internal energies of the TraPPE model evaluated with the TraPPE model (NOT with the new state, i.e. the Potoff or Mess-UP models).
Or, should I compute the expectation values like I did on lines 120-121? Note that I get the same result for EA_k[0] and EA_k_alt[0] but I get a very different value for EA_k[1] and EA_k_alt[1]. The difference is that for EA_k_alt[1] (line 121) I use the MBAR weights multiplied by U_01, i.e. the internal energies evaluated using the new state (Potoff or Mess-UP) from a rerun evaluation using the configurations for the reference state (TraPPE).
Compare "MBAR" and "MBAR_alt" for internal energy on the top figure (titled "Comparison of Different..."). The goal is to have the MBAR blue line or the MBAR_alt orange line match the "Simulated" green line. Recall that "Simulated" is obtained by actually simulating the new model, rather than using MBAR or "Rerun", whereas "Rerun" just assumes that the configurations all have a weighting of 1 (the PCF-PSO approach).
You may notice that when the new state is the Potoff model (i.e. new_state = 1) the MBAR_alt orange line is not only better than the MBAR blue line but it is better than the Rerun red line. (By "better" I mean that at "Model = 1.0" it matches the Simulated green line more closely.) On the other hand, when the new model is the Mess-UP model (new_state = 2) the "Rerun" red line agrees most closely with the "Simulated" green line, while the MBAR_alt orange line significantly underpredicts the internal energy and the MBAR blue line significantly overpredicts it. This is very surprising since Mess-UP has better phase space overlap (the number of effective samples is much greater for Mess-UP than for Potoff).
So why would the Potoff model work better with MBAR_alt than Mess-UP when Potoff only has one effective sample? Did Potoff just get "lucky"? Or is it because MBAR_alt assigned a weight of 1 to the only sample for Potoff that had an internal energy similar to what the real Potoff system would produce? For example, the only sample that contributes (sample ~750) also has the most negative internal energy (see "Model 1 Rerun" in the second figure from the top). Or is it because MBAR_alt is incorrect? If MBAR_alt is incorrect (and MBAR is correct) why then does "Rerun" outperform the Mess-UP model when Mess-UP has over 500 effective samples? Could it be because the pressure fluctuations are so large in the liquid phase that some states are assigned a greater weight than they really should have? For example, when using the Mess-UP model (i.e. new_state = 2) the second figure from the bottom shows that samples ~220 and ~750 are contributing too much to the average. This can be seen in the second figure from the top where samples ~220 and ~750 for "Model 2 Rerun" have internal energies that are much more negative than most of the internal energies for "Model 2 Simulated".
I know that it might be hard to follow exactly what I did and what I am referring to in my questions, so if something is unclear just let me know. Note that I have also included the main conclusions and my questions in the README file and the python code.
Finally, based on the information you provided in your previous emails, I have attempted to use smaller system sizes (50 molecules rather than 400) and longer simulation runs (20 times longer). But the results I obtained were essentially the same. That is, Potoff still has around 1 effective sample and MBAR_alt is better than Rerun which is better than MBAR.
Also, sorry for the notation "MBAR_alt". I don't mean to imply that I have modified MBAR in any way. I am just not sure which "MBAR" result is the correct way to implement MBAR.
I look forward to hearing what ideas you might have.
Thanks again for your expertise.
Enjoy your time in Blacksburg!
Rich