Additional Compounds - Githubissues

ramess101 commented 6 years ago

@jrelliottoh @mostafa-razavi I am about to begin investigating other compounds to see how general our results are for MBAR/PCFR. I want to make sure that I am sampling the ITIC state points that you two would recommend. I have tried to follow what Mostafa's dissertation recommends and the values we used previously for ethane.

For example, for hexafluoroethane here are the ITIC points I am proposing to sample. I have included the TDE correlations to experimental data as well as the TraPPE and Potoff simulation results.

Here are the tabulated values:

State	Density (kg/m3)	Temperature (K)
IT,rho	237.8429	350
	475.6857	350
	713.5286	350
	951.3714	350
	1189.214	350
	1308.136	350
	1427.057	350
	1545.979	350
	1664.9	350
IC,rho0	1664.9	173.107
	1664.9	230.7399
IC,rho1	1545.979	207.912
	1545.979	259.8252
IC,rho2	1427.057	233.044
	1427.057	279.3548
IC,rho3	1308.136	252.82
	1308.136	293.5384

I use a Tr of 1.195 for the isotherm. The lowest isochore density corresponds to the triple point (which is Tr around 0.6). I have evenly spaced densities on isotherm from rho0-rho4 and then from rho5-rho8 the spacing is cut in half.

My main question is, how do you determine the intermediate temperature for the isochores? Is there some methodology or just a value somewhere between Tsat and the Tr=1.2 isotherm?

Do you see anything wrong with the state points I have selected for my ITIC runs?

mostafa-razavi commented 6 years ago

@ramess101

Do you see anything wrong with the state points I have selected for my ITIC runs?

Your IT and IC looks good. I produced these numbers using my program and the numbers you used (1.195 and 0.6) and I got similar numbers.

T_IT: 350.17 N_IT: 9 RHO_IT: 0.2376 0.4751 0.7127 0.9503 1.1879 1.3066 1.4254 1.5442 1.6630

RHO_HIGH: 1.6630 N_IC: 5 RHO_IC: 1.1879 1.3066 1.4254 1.5442 1.6630 T_IC1: 265.84 302.24 T_IC2: 249.12 291.13 T_IC3: 228.39 276.46 T_IC4: 203.87 257.70 T_IC5: 175.82 234.10

how do you determine the intermediate temperature for the isochores?

The middle point on each isochore should be equal to 1000/AVERAGE(1000/T_IT,1000/Tsat)

The only thing I could add is that to making sure TrSat's does not go beyond 0.85 if you are not only using B2. In your case max TrSat equals 0.86 which should be fine.

ramess101 commented 6 years ago

@mostafa-razavi

Great, thanks for looking at this.

ramess101 commented 6 years ago

@mrshirts

I have some very interesting results for hexafluoroethane. The reference force field is the TraPPE model (epsilon = 87 K, sigma = 0.436 nm, lambda = 12) and the target force field is Potoff (155.75 K, 0.4475 nm, 36). I emphasize that Potoff has an epsilon that is nearly twice that of TraPPE and a lambda of 36! So this is a really tough test for MBAR and PCFR. Note that the rmin for Potoff (0.475 nm) is actually less than that of TraPPE (0.489) despite having a larger sigma.

In brief, the result confirms that PCFR should be used to predict U when converting a LJ to a Mie potential (neither approach can predict Z in this case). MBAR performs extremely poorly because there are really no reference configurations from the LJ 12-6 that are even remotely feasible for the Mie 36-6 potential. Here is a parity plot (forgive the quality of the figures):

Clearly MBAR (and a constant PCF, i.e. weights = 1) fails miserably. I also included the "TraPPE" points to show that the TraPPE energies are actually not that different from the Potoff energies, but the configurations are obviously much different. Notice that PMF and zeroth (two different PCFR methods) are quite good. In the figure below I focus on the two PCFR methods.

The difference in these two methods is that PMF uses a predicted PCF based on a predicted PMF while the zeroth method just uses the exact zeroth order PCF. The "PMF" points are obtained using Equations 19 and 20 of my manuscript and the "Zeroth" points use Equations 22 and 24 (as of the date of this post). This figure actually suggests that the zeroth order approach is better than the PMF approach. I have observed a similar result for other longer chained n-alkanes, namely, slight improvement using the zeroth approach. Although the difference between PMF and zeroth for ethane was so small that I decided to use PMF, I think I need to change my recommendation in my paper. The zeroth approach actually has some significant advantages. First, it is much easier to implement. Second, it is more stable because it does not require bin spacing or other numerical issues. For these reasons, I am going to alter the PCFR development section. If you have not already started reading my manuscript, let me make these changes in the PCFR section. Specifically, I am going to recommend that the difference in the zeroth order PCF be used rather than trying to predict the entire PCF.

mrshirts commented 6 years ago

Thanks, Rich. No time to read it yet, so go ahead and add.

ramess101 commented 6 years ago

@mrshirts

I have now uploaded a polished rough draft. You will notice some significant changes. Specifically, I changed my recommended PCFR approach. I included the additional compounds results. I also reordered the PCFR derivation. I think that now it makes more sense to derive it the way that I originally did. I include the comparison between MBAR and PCFR within the derivation itself. Let me know what you think when you get the chance.

Thanks

ramess101 commented 6 years ago

@mostafa-razavi @jrelliottoh

Previously in Issue #2 I demonstrated that my Gromacs-ITIC results for propane and butane are consistent with Siepmann's GEMC results and reasonably consistent with Potoff's GOMC results. However, when I move to larger compounds I get strong disagreement. For example, here are my results for n-octane using the TraPPE model:

I have been looking over your thesis where you performed a thorough investigation with n-octane. Specifically, I am trying to replicate the work you did on pages 46-57. Currently, I am comparing my results with your Figures 2.13, 2.14, and 2.15. Before I show you my results, I have a few clarifying questions that could help me find the problem.

First, on page 58 you show the LJ parameters "used for comparing GEMC, GCMC, and ITIC", but were the results presented in Figures 2.13, 2.14, and 2.15 obtained with those parameters or with the TraPPE model? Second, what was the supercritical isotherm temperature that you used? I am using 682 K (1.2 Tc) but I noticed from your plots that 1000/T_IT is about 1.6-1.7 which would be about 588-625 K. Third, is the y-axis for Figure 2.15 unitless? Or does it have units of kJ K/mol? And why do you multiply uDep by T? In my plots I divide Udep by RgT to get it unitless and to make it linear vs 1000/T, with a similar shape to what you have.

Here are my plots that are analogous to yours:

The supercritical isotherm plot is on the same scale as your Figure 2.14 to facilitate a comparison. My REFPROP curves may not be reliable since the REFPROP correlation is only good to 600 K and my isotherm is 682 K. Similarly, the B2 TDE value is beyond its' range of reliability. But the point is that I see a strong difference from REFPROP and without knowing the isotherm from your figure I am not sure if my results are consistent with yours.

Note that my other two plots are on different scales than yours because of the strong differences in my results.

My Z along the isochore plot (compare to your Figure 2.13). Note the strong deviation between my isochores and those from REFPROP. REFPROP should be reliable at 1000/T > 1.66 and REFPROP IC3 and REFPROP IC4 are cutoff at the temperature where the correlation along these isochores is no longer reliable. So the strong deviations should be attributed to an error in my simulations. It is curious, however, that the colors seem to correspond to different isochores, but I have verified the densities used in simulation.

My Udep along the isochore plots. Although comparison to your Figure 2.15 is difficult due to the different y-axis, the strong deviation from REFPROP is cause for concern. Again, REFPROP values are reliably for the range plotted with 1000/T > 1.66. We again see that my simulation results agree more with different REFPROP isochores.

I have tried validating my GMX (gromacs) results by simulating the TraPPE potential on Towhee and Cassandra. I get similar angular energies and non-bonded energies. However, the dihedral energies differ strongly and the pressures (and hence Z) that I get are very different. I have verified that my dihedral parameters and non-bonded parameters are correct in Gromacs. I am trying to figure out if this is a 1-4 interaction problem. I have tried changing the scaling factor for 1-4 interactions but I did not observe any difference. In your Thesis you demonstrated that it is important but can be tedious to account for intramolecular LJ energies. However, in your case I don't think you saw such poor values for Z as well, just your Udep was affected, right? So I think it is more than just removing the 1-4 interactions from the Udep post-simulation. It appears that it has more to do with the simulations themselves, whether it be the force field or the integrator I am not sure.

I was considering repeating the torsion analysis you did of plotting the probabilities of different dihedrals. But I feel like my problems are more fundamental than poor torsional sampling. Do you have any other ideas for what I should look at next? Also, when computing Udep do you assume that the torsional energy is the same in the liquid phase as the ideal gas? I.e. is your Udep just the non-bonded energy? Or do you perform a single molecule simulation to determine the torsional energy of the ideal gas? I believe the torsional energy can change drastically from an ideal gas to a liquid phase for a large molecule, but I was not sure if you account for that difference in ITIC.

Thanks again for any insight you can provide.

ramess101 commented 6 years ago

@mostafa-razavi @jrelliottoh

Just as a point of reference, this is what my Z and U plots look like for the TraPPE model with propane and butane.

Propane:

Butane:

The main conclusion is that the shift in Udep is observed for propane and butane as it was for n-octane. This is expected as it contributes to the over-prediction of vapor pressure with the TraPPE model. So my poor results for n-octane should not be attributed to Udep. Instead, I believe the primary problem is Z along the isochore. Notice that propane and butane demonstrate a small systematic shift at higher temperatures while intersecting with REFPROP around Z=0. By contrast, n-octane has a completely different slope and intersection with Z=0. This intersection is pivotal to the value of Tsat, which should be about the same between REFPROP and TraPPE since TraPPE reproduces rhol quite accurately.

mostafa-razavi commented 6 years ago

@ramess101

First, on page 58 you show the LJ parameters "used for comparing GEMC, GCMC, and ITIC", but were the results presented in Figures 2.13, 2.14, and 2.15 obtained with those parameters or with the TraPPE model?

I have used the parameters shown in Table 2.5 using shifted-force Lennard-Jones potential function. That was our optimized set of LJ parameters at the time for our shifted-force LJ force field.

Second, what was the supercritical isotherm temperature that you used? I am using 682 K (1.2 Tc) but I noticed from your plots that 1000/T_IT is about 1.6-1.7 which would be about 588-625 K.

IT temperature was 600 K for n-octane.

Third, is the y-axis for Figure 2.15 unitless? Or does it have units of kJ K/mol? And why do you multiply uDep by T? In my plots I divide Udep by RgT to get it unitless and to make it linear vs 1000/T, with a similar shape to what you have.

In my plots, uDep is unitless and uDep*T has a units of kelvin. I multiplied uDep by T because uDepT is used in integration when calculating aDep.

I simulated TraPPE octane using LAMMPS with flexible bonds (kB = 191.75 kcal/mol/Å2), and I got the following ITIC results: trappe-c8

As you can see, vapor pressure and liquid densities are relatively consistent with Siepmann's results (unfilled circles). If I kept the bonds constant the consistency would be even better, but LAMMPS cannot keep the bonds of large molecules constant. The only difference between the above results compared to your GMX simulation is that (I assume) you are using LINC to keep the bonds fixed. Could that be a the reason? You can try to run one NVT simulation of nC8 at rho=0.708 gcc and T=600 K in GMX using kB = 191.75 kcal/mol/Å2 to see if you can reproduce the Z value of 6.846 +/- 2.084 that I obtained in LAMMPS.

I have tried validating my GMX (gromacs) results by simulating the TraPPE potential on Towhee and Cassandra.

Did you perform a complete ITIC simulation in Cassandra? If yes, can you post the results here? Based on my C12 simulations, performing ITIC using Cassandra NVT points gives rhoL and pSat that are consistent with Siepmann's TraPPE results. I will simulate C8 ITIC in Cassandra and will let you know, but most probably it will match Siepmann's data.

If it's not LINC, then I would say something is wrong in your GMX input files.

ramess101 commented 6 years ago

@mostafa-razavi

I have used the parameters shown in Table 2.5 using shifted-force Lennard-Jones potential function. That was our optimized set of LJ parameters at the time for our shifted-force LJ force field.

OK, that makes more sense. Since your model matches vapor pressure more accurately it is expected that the internal energies are in better agreement with NIST.

IT temperature was 600 K for n-octane.

Alright, that also helps explain why your NIST curve looks good.

In my plots, uDep is unitless and uDep*T has a units of kelvin. I multiplied uDep by T because uDepT is used in integration when calculating aDep.

OK, so uDep is (U-Uig)/RT, right? I was referring to Udep as (U-Uig). Then your uDep*T is really (U-Uig)/R while my plots are (U-Uig)/RT. Anyways, I think that clears that up for me.

I simulated TraPPE octane using LAMMPS with flexible bonds (kB = 191.75 kcal/mol/Å2), and I got the following ITIC results: As you can see, vapor pressure and liquid densities are relatively consistent with Siepmann's results (unfilled circles). If I kept the bonds constant the consistency would be even better, but LAMMPS cannot keep the bonds of large molecules constant.

Yes, I agree that the deviation between your ITIC results and Siepmann's is probably attributed to flexible bonds. But still they are way more accurate than what I am getting.

The only difference between the above results compared to your GMX simulation is that (I assume) you are using LINC to keep the bonds fixed. Could that be a the reason?

Yes, I am using LINCS. LINCS did not appear to be a problem for propane or butane, so I don't think LINCS is necessarily the problem unless Gromacs struggles to use LINCS with longer molecules. I failed to mention that I also saw strong deviations for pentane. So I still think it is the 1-4 interactions or the torsions.

You can try to run one NVT simulation of nC8 at rho=0.708 gcc and T=600 K in GMX using kB = 191.75 kcal/mol/Å2 to see if you can reproduce the Z value of 6.846 +/- 2.084 that I obtained in LAMMPS.

OK, I will try that and get back to you.

Did you perform a complete ITIC simulation in Cassandra? If yes, can you post the results here?

I only performed simulations with Cassandra at the two lower temperatures of the two most dense isochores (since those seemed to have the biggest problems with Z). I currently don't have a script for simulating ITIC with Cassandra since I don't anticipate using it much.

Based on my C12 simulations, performing ITIC using Cassandra NVT points gives rhoL and pSat that are consistent with Siepmann's TraPPE results. I will simulate C8 ITIC in Cassandra and will let you know, but most probably it will match Siepmann's data.

If you saw good results for C12 I assume you will see the same with C8. Thanks for offering but don't worry about running those simulations unless you want to. I don't want to add to your work load.

If it's not LINC, then I would say something is wrong in your GMX input files.

Yeah, I will test out LINCS, then the 1-4 interactions, and then the torsions. If none of that works I will reach out to the gromacs community.

Thanks a lot!

ramess101 commented 6 years ago

@mostafa-razavi

Just wanted to let you know that I figured out the problem. As expected, I had an error in my topology file. However, it was not LINCS, or 1-4 interactions, or torsions. The problem was "exclusions." For some reason I did not have exclusions set to 3, instead it was set to 8 (my script changed this value to the number of carbons when it should have been left constant). In GROMACS, exclusions are used when two sites within the same molecule should not have nonbonded interactions. This is typically set to 3, meaning that nonbonded interactions are excluded from all sites that are within 3 bonds of eachother (no 1-2, 1-3, or 1-4 nonbonded interactions, although 1-4 can be added separately). By having exclusions set to 8, my octane molecules did not have any 1-5, 1-6, 1-7, or 1-8 intramolecular nonbonded interactions. Obviously, this would have a big impact on energies and pressures.

Using the same model at the same state point that you simulated in LAMMPS, I now get a value of 6.862 for Z which is in excellent agreement with your value of 6.846. Note that before with exclusions = 8 I got Z =4.1.

You can try to run one NVT simulation of nC8 at rho=0.708 gcc and T=600 K in GMX using kB = 191.75 kcal/mol/Å2 to see if you can reproduce the Z value of 6.846 +/- 2.084 that I obtained in LAMMPS.

The output files from GROMACS do not distinguish between intramolecular and intermolecular nonbonded energies. As you mentioned in your thesis, this can be a bit of a pain when applying ITIC. Fortunately, now we know that by setting the number of exclusions to a value >= length of the chain we can eliminate intramolecular nonbonded energies. So all we need to do is perform a simulation with the correct number of exclusions (3) and then perform a rerun energy calculation (like we always do) but with exclusions = 8 (for octane) and then we will have just the intermolecular nonbonded energies for the correct configurations sampled.

I am still trying to figure out exactly what we want for Udep. I assumed that Udep should be just the intermolecular nonbondeds, but that does not appear to be what you thesis says. Could you clarify exactly what the conclusion was in your thesis concerning this issue of intramolecular and intermolecular nonbondeds for ITIC?

Here you say that LAMMPS INCLUDES the intra nonbonded in the output eVdw:

So from this I would think that you need to EXCLUDE the intra nonbonded to get the vapor pressures (and udep) to match. Since you say that including the intra contribution leads to an error in udep:

But then you say that the error in vapor pressure arises when intra nonbonded is neglected:

Do you mean that when intra nonbonded is EXCLUDED you see the error? Or that when you neglect to remove the intra contribution you see the error?

These figures seem to suggest that the error is when intra is EXCLUDED. Notice that udep and Psat match better when intra is INCLUDED. But I thought that INCLUDING intra nonbonded was the problem.

Could you clear this up for me?

Thanks

mostafa-razavi commented 6 years ago

@ramess101 Your VDW energy should not include intramolecular part to get the pressure right.

Using the same model at the same state point that you simulated in LAMMPS, I now get a value of 6.862 for Z which is in excellent agreement with your value of 6.846. Note that before with exclusions = 8 I got Z =4.1.

Can you repeat this simulation but use LINC if it doesn't take too much time from you? I would like to compare that with Cassandra.

The output files from GROMACS do not distinguish between intramolecular and intermolecular nonbonded energies.

I vaguely remember that in one of your emails you wrote that GROMACS does distinguish between intramolecular and intermolecular nonbonded energies

OK, are you just referring to distinguishing between the internal contributions from INTRA- and INTER- molecular interactions? If so, the Gromacs output files do separate the two terms nicely.

. Did you discover otherwise?

ramess101 commented 6 years ago

@mostafa-razavi

Your VDW energy should not include intramolecular part to get the pressure right.

OK, that is what I thought. Thanks for clarifying that.

Can you repeat this simulation but use LINC if it doesn't take too much time from you? I would like to compare that with Cassandra.

Yeah, no problem. I will post the results before the end of the day...hopefully.

I vaguely remember that in one of your emails you wrote that GROMACS does distinguish between intramolecular and intermolecular nonbonded energies

OK, are you just referring to distinguishing between the internal contributions from INTRA- and INTER- molecular interactions? If so, the Gromacs output files do separate the two terms nicely.

. Did you discover otherwise?

Perhaps I misunderstood how you were using INTRA and INTER back when I posted that a while ago. Gromacs DOES distinguish the INTRA contributions (like harmonic bonds, angles, torsions) and the INTER contributions (what Gromacs calls LJ (short range)) to the internal energy. It does NOT distinguish the 1-5, 1-6, etc. Lennard-Jones INTRA interactions from the truly INTERmolecular nonbonded LJ. In other words, the 1-5, 1-6, etc interactions are lumped into the LJ (SR) term. I believe the 1-4 interactions have their own term in the output file, but I am not positive. Sorry for the confusion.

ramess101 commented 6 years ago

@mostafa-razavi

Using LINCS I get a compressibility factor of 6.8 +- 1. So similar to what I got with the harmonic bonds. Note that this uncertainty of 1 is just what Gromacs provides from the standard deviation. The actual precision is probably much better.

What do you get using Cassandra?

Here are all the properties Gromacs provides that you might be interested in. Note that internal energies are on a 400 molecule basis:

Also, if you want just the LJ (SR) from intermolecular you have to perform a rerun calculation with nrexcl = 8. This gives the same angle, torsion, KE, etc. but a different LJ (SR):

The difference between the top LJ (SR) (which includes INTRA nonbonded) and the bottom LJ (SR) (which is only INTER nonbonded) is:

(-12737.8)-(-11982.4)=-755.4

So the INTRA nonbonded energy is -755.4 kJ/mol and the INTER nonbonded is -11982.4 kJ/mol (both per 400 molecules).

How does that compare with Cassandra?

ramess101 commented 6 years ago

@mostafa-razavi

Although I was able to reconcile our octane simulations at the single state point (see above), I was not able to get accurate predictions of vapor pressure.

Our plots for n-octane with the TraPPE model look fairly similar (although we used slightly different densities and temperatures).

Here are the plots for Udep. Although the y-axis scale is slightly different we see the same trends, i.e. TraPPE over predicts relative to REFPROP:

Yours:

Mine:

and for Z we see that TraPPE under predicts:

Yours:

Mine:

Note that the figures you provide me with had harmonic bonds while mine have fixed bonds, but they are still in pretty good agreement.

However, my vapor densities and consequently my vapor pressures are way off:

Yours:

Mine:

To make for an easier comparison, could you send me the exact isotherm densities and isochore temperatures you simulated for n-octane with TraPPE? I know you reported IT = 600 K as well as the IC densities and I could try to get the rest from the figures but I want to simulate the exact same state points. The box sizes and number of molecules would also work instead of the densities.

I am using B2 and B3 from REFPROP, so maybe that is the problem? I tried playing around with B2 but it didn't seem to fix anything. Could you provide me with the B2 and B3 values that you got for TraPPE and used to generate your figures?

Also, when computing Udep you just take the INTERmolecular nonbonded energies, right? This assumes that U_intra is the same for the ideal gas and the condensed phase. Do you think that is a good assumption for these longer molecules? I would expect that U_intra is quite different between the ideal gas and the condensed phase as the chain-length increases. Have you tried performing a single molecule "ideal gas" simulation to compute U_intra and compare that to U_intra in the condensed phase? I tried that and my results appeared to get worse, but it seems like that is the more rigorous approach.

Sorry for being so needy. Hopefully my questions are helpful to you as well as you perfect the ITIC approach.

mostafa-razavi commented 6 years ago

What do you get using Cassandra?

I'll report once the simulation is done.

To make for an easier comparison, could you send me the exact isotherm densities and isochore temperatures you simulated for n-octane with TraPPE? I know you reported IT = 600 K as well as the IC densities and I could try to get the rest from the figures but I want to simulate the exact same state points. The box sizes and number of molecules would also work instead of the densities.

These (RunITIC.txt) are the exact temperatures and densities that I used for those simulations that you attached.

I am using B2 and B3 from REFPROP, so maybe that is the problem? I tried playing around with B2 but it didn't seem to fix anything. Could you provide me with the B2 and B3 values that you got for TraPPE and used to generate your figures?

I was using DIPPR values of B2 at the time for C8, but it is not exact. We should use B2 and B3 of TraPPE. B2 of IT is the most important B2 value used in ITIC method which is exactly equal to -4.3976 cc/g at 600K. Now I'm using Schultz's B2 values that I'm attaching (TraPPE-C8-B2B3.txt). You can make a correlation of B2 wrt T and put it in your program. I use these correlations for C8:

B2(cc/g)=-7.4672(1/Tr)^3+17.7516 (1/Tr)^2-25.9879 (1/Tr)+10.659 B3(cc^2/g^2)=-678.8960(1/Tr)^3+2356.0898(1/Tr)^2-2729.2549(1/Tr)+1064.2969

which have negligible deviation from Schultz's B2 and B3 values in 300-600K range for B2 and 350-520K range for B3.

Please let me know if your results improve when you use these correlations.

Also, when computing Udep you just take the INTERmolecular nonbonded energies, right? This assumes that U_intra is the same for the ideal gas and the condensed phase. Do you think that is a good assumption for these longer molecules? I would expect that U_intra is quite different between the ideal gas and the condensed phase as the chain-length increases. Have you tried performing a single molecule "ideal gas" simulation to compute U_intra and compare that to U_intra in the condensed phase? I tried that and my results appeared to get worse, but it seems like that is the more rigorous approach.

I'm looking into it

Sorry for being so needy. Hopefully my questions are helpful to you as well as you perfect the ITIC approach.

It's OK. Your points are definitely helpful. Thanks

ramess101 commented 6 years ago

@mostafa-razavi

I'll report once the simulation is done.

Thanks!

These (RunITIC.txt) are the exact temperatures and densities that I used for those simulations that you attached.

Great, thanks! I will let you know when my simulations are done.

Please let me know if your results improve when you use these correlations.

Using your B2 and B3 correlations did not seem to fix it. But maybe once we get my values in the ballpark these will help refine my results.

I'm looking into it

Great! I talked with @mrshirts about this. He agreed that we shouldn't assume the ideal gas and condensed phase intramolecular energies are the same. I verified this by performing a single molecule MD run and compared this with the condensed phase. In every case the energies were different by a significant amount. However, this actually made my results worse. Michael expressed concern about performing a single molecule MD run. He said this might not give us the correct ideal gas values for our force field, especially if we use velocity-verlet. But I got the same ideal gas values when I used a stochastic integrator and even when I used MC.

ramess101 commented 6 years ago

@mostafa-razavi

OK, so I performed the ITIC simulations at the exact same state points that you provided for n-octane. I used the TraPPE model and the harmonic bond potential that you recommended (since that was what you used in your simulations). My results suggest that there is an issue with the pressures I get from Gromacs.

If you recall, previously we validated that I got the same compressibility factor at the highest density along the isotherm (included in figure below). However, it appears that at lower densities along the isotherm there is a systematic error in Z.

If you recall, your (Z-1)/rho plot agrees much more closely with REFPROP at low densities:

The Z values along the isochores look OK (except at Tsat):

The internal energies actually look very similar.

So I think the primary culprit for the poor results is Z along the isotherms.

The impact of the poor Z estimates is apparent when plotting Adep:

Mine:

Yours:

I have tried to make all our figures have the same ranges and tick marks, but it would be easier to make the comparison if I had your exact values. If it is not too much of a hassle, could you send me your Z and U values as well?

jrelliottoh commented 6 years ago

I guess this is the best you can do if Tr=0.6 at the triple point. It looks like Tr~0.86 at 253K. The intermediate temperatures are chosen as halfway between 1/Tsat and 1/Tisotherm. We space evenly on the reciprocal temperature scale because the is the relevant variable for integration.JRE

On Thursday, January 11, 2018, 3:25:50 PM EST, Richard Messerly <notifications@github.com> wrote:

@mostafa-razavi

OK, so I performed the ITIC simulations at the exact same state points that you provided for n-octane. I used the TraPPE model and the harmonic bond potential that you recommended (since that was what you used in your simulations). My results suggest that there is an issue with the pressures I get from Gromacs.

If you recall, previously we validated that I got the same compressibility factor at the highest density along the isotherm (included in figure below). However, it appears that at lower densities along the isotherm there is a systematic error in Z.

If you recall, your (Z-1)/rho plot agrees much more closely with REFPROP at low densities:

The Z values along the isochores look OK (except at Tsat):

The internal energies actually look very similar.

So I think the primary culprit for the poor results is Z along the isotherms.

I have tried to make all our figures have the same ranges and tick marks, but it would be easier to make the comparison if I had your exact values. If it is not too much of a hassle, could you send me your Z and U values as well?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

ramess101 commented 6 years ago

@jrelliottoh

These results are actually for octane, so 253 K is around Tr = 0.45. I don't think it is a problem with how we set-up ITIC, I think it is a Gromacs issue or a user error. I am currently running jobs with a shorter time-step (before was 2 fs, now 1 fs) just to see if somehow that was the problem.

mrshirts commented 6 years ago

Are you running any of these in gromacs with constraints, or not with constraints? insufficiently converged constraints might give incorrect pressures (or incorrect reported pressures). What is the estimated magnitude of the error?

Does this involve the pressure generated by rerun at all?

ramess101 commented 6 years ago

@mrshirts

Are you running any of these in gromacs with constraints, or not with constraints?

These results are actually with a harmonic bond model, i.e. no constraints. I am using this model so that I can compare my results with those that @mostafa-razavi generated using the same harmonic potential in LAMMPS.

insufficiently converged constraints might give incorrect pressures (or incorrect reported pressures). What is the estimated magnitude of the error?

I looked into this when you mentioned it last week. The constraints RMSD is order 10^-7. I increased the nlincs_order to 8 (default is 4) and saw a decrease in RMSD of about a factor of 2. Although this did change the pressures slightly it was not enough to resolve the discrepancy.

Does this involve the pressure generated by rerun at all?

I do use the rerun pressures in my analysis (just because my code is written to work with MBAR if the parameters change). These should be nearly identical to the direct simulation because I use the same force field parameters in both. I know that the precision is a little bit worse since I only have 1000 snapshots instead of the 100,000 time-steps. But the large bias in my results suggests it is not a "large uncertainty" problem. Was this your concern?

mrshirts commented 6 years ago

But the large bias in my results suggests it is not a "large uncertainty" problem. Was this your concern?

The rerun velocities are off, which means you should be recomputing from the virial, not the pressure. I thought you were already doing that though?

Also, I also wanted clarification if the pressures were off JUST in reruns, or if they are indeed off in direct simulation as well.

ramess101 commented 6 years ago

@mrshirts

The rerun velocities are off, which means you should be recomputing from the virial, not the pressure. I thought you were already doing that though?

I am outputting velocities at the same frequency as coordinates (and energies). Will the rerun velocities still be off? I only use the virial when I develop my basis functions. In this case I am not doing anything with basis functions because I am just analyzing the reference simulation, so I used pressure instead.

Also, I also wanted clarification if the pressures were off JUST in reruns, or if they are indeed off in direct simulation as well.

I verified that the direct simulation pressures are off too. They are essentially the same. Because of the large fluctuations they look slightly different but in either case Z is way different than what it should be. For example, this is the Z isotherm from direct simulation:

mrshirts commented 6 years ago

Rerun velocities are almost certainly incorrect (maybe some combination of conditions) - there should be warning printed? I think right now they might just be the initial kinetic energy in many cases. This is because rerun velocities depend on the integration algorithm, which is impossible to reconstruct with just frames. Pressure would also be a little off, because it involves KE. Virial for harmonic simulation should be OK (probably not for constraints, because of the virial contribution from the constraints. If you want to calculate an overall pressure by calculating the average of the virial and adding the analytical contribution of KE, that should work fine, though.

I verified that the direct simulation pressures are off too. They are essentially the same. Because of the large fluctuations they look slightly different but in either case Z is way different than what it should be. For example, this is the Z isotherm from direct simulation:

Can you give me a simulation you ran with what the pressure WAS, and tell me what it should be? We can try to figure out what is causing the deviation.

To m, those Z curves don't look that different, but I don't have a sense of the needed accuracy :)

Perhaps we should talk in person on this when discussing paper?

ramess101 commented 6 years ago

@mrshirts

Yeah I think talking in person would be helpful. The (Z-1)/rho curves need to be very accurate because the integral is used to compute Adep which rhov and Psat depend on.

I am still confused about the rerun issue because my rerun pressures are essentially the same as the direct simulation pressures. If I change the frequency for recording the velocity (vout < xout) Gromacs does return a warning and the pressures are nonsensical.

FYI I also performed simulations that were three times longer (both equilibration and production) but the results were unchanged. The fluctuations in pressure are much greater using harmonic bonds than constrained bonds, so I am going to repeat this with LINCS to see if it gets any better.

mrshirts commented 6 years ago

Maybe send a quick sample script (easy, one rerun) for what you are doing? And again, a test case where you have a prediction for what the simulation pressure would be (not experimental, but simulation with same parameters, like Cassandra simulation).

I suspect LINCS will have larger problems with rerun pressures being wrong (constraint virial can't be computed from single snapshot).

ramess101 commented 6 years ago

@mrshirts Thanks, I will post my Gromacs files tonight. Mostafa’s valued are currently the best to compare to (he used LAMMPS) but I don’t have the exact numbers yet. I do have some values using Towhee, but that was with constrained bonds.

mrshirts commented 6 years ago

Comparison to LAMMPS is decent enough, IF we have the lammps run files. I want to know exactly what the differences are in the runs to make it easier to track down.

mostafa-razavi commented 6 years ago

@mrshirts Attached is my TraPPE-C8-LAMMPS simulation files. LammpsRdr.res have the exact values that I used in my plots. DataFiles folder contains lammps datafiles for each density. IT and IC folders contain all simulation file such as log file and output files.

@ramess101 I ran into some difficulty simulation rigid-bond C8 with Cassandra. I'll keep you updated about that. Can you send your Gromacs input files for C8? I would also like to take a look.

TraPPE-C8-LAMMPS.zip

ramess101 commented 6 years ago

@mostafa-razavi

Thanks for posting all this! This will be very helpful. I have plotted your Z (and Z-1/rho) values and I have included uncertainties for my values. It looks like they actually agree very closely except for the three lowest densities. I noticed that you used 150 molecules for state point except those three, for which you simulated 600 molecules. By contrast, I used 400 molecules in every simulation. I am going to try simulating larger system for these low density IT points and see if that fixes it.

ramess101 commented 6 years ago

@mostafa-razavi @mrshirts

I have attached a directory of my scripts, input files, raw data, and final results. TraPPE_long contains the values I plotted.

TraPPE_C8_harmonic.zip

mrshirts commented 6 years ago

@mostafa-razavi @ramess101

Hmm. A lot of files here. To maximize chance of finding what is needed, can you two point me to the single set of NVT GROMACS .top/.gro/.mdp files and LAMMPS input files that you most want to figure out why they don't agree?

ramess101 commented 6 years ago

@mrshirts Yes, sorry for the file dump. My files can be found at this path: Isotherm/rho_#/NVT_eq/NVT_prod where # is either 0, 1, or 2, corresponding to the three lowest densities on the isotherm. nvt_prod.mdp and md_nocord.mdp are found there as well as the log files. To make the files small enough for Github I actually removed the .gro files. The .inp file is at the root directory under C8H18_ref0 (something like that, typing this from memory on my phone). Most of the other files in the root directory are just included so that if you walked through the scripts you would know what the files will look like.

mostafa-razavi commented 6 years ago

@mrshirts Since the biggest discrepancy between my LAMMPS results and Richard's Gromacs results occurs in the lowest density on isotherm, I attached LAMMPS data file (0.10114.data) and input file (input.in) for this state point (T=600K and rho=0.10114 gcc). I also attached the log file (log.lammps) and the result of my post processing (LammpsRdr.res). I hope this helps.Please let me know if you need any other information.

LammpsFiles.zip

mostafa-razavi commented 6 years ago

@ramess101 Attached is the Cassandra results for TraPPE-C8-rigid-bonds.

CASNVT-TraPPE-C8.zip

ramess101 commented 6 years ago

Attached is the Cassandra results for TraPPE-C8-rigid-bonds. How does that compare with Cassandra?

Here is a comparison between the values @mostafa-razavi provided using both LAMMPS and Cassandra with those I obtain using Gromacs (GMX). Since LAMMPS used harmonic bonds I have a comparison with my GMX runs using harmonic bonds. Since Cassandra used fixed bonds I compare those values with my LINCS simulations. I include Mostafa's results in each figure to provide a common frame of reference (this is especially useful for Udep).

Here are the GMX results for harmonic bonds (compare with LAMMPS):

Here are the GMX results for fixed bonds (compare with Cassandra):

The primary conclusions are:

The harmonic bond results demonstrate strong deviations in Z (pressure) for both IT and IC. Previously I thought that my IC results were pretty good for the harmonic simulations but I found a small error in those simulations that was causing the values to look better than they should be (specifically, my previous results accidentally used a fixed bond model during equilibration). The results presented here are for the files that I posted for @mrshirts to look at it, i.e. the more reliable results that use harmonic bonds in both equilibration and production.
The fixed bond results are in much closer agreement for Z, only demonstrating strong deviations at low densities along IT. However, there does appear to be a consistent (albeit relatively small) negative bias for both IT and IC compared to Cassandra.
Both the harmonic and fixed bond results agree closely with the LAMMPS Udep. However, the Cassandra Udep appears to be much different. I tried including and excluding INTRA Vdw for Cassandra but in both cases the results were not consistent with Gromacs and LAMMPS.

@mostafa-razavi Can you verify that for both the Cassandra and LAMMPS values you provided, Udep = eVdw - IVdw? Using eVdw-IVdw the Cassandra Udep values are 2-5% greater than those from LAMMPS. I know this could be related to using harmonic bonds in LAMMPS compared to fixed bonds in Cassandra, but I got essentially the same values for Udep using harmonics and fixed bonds in Gromacs, so I wanted to double-check.

I ran some fixed bond simulations using 800 molecules over the week-end. I will post those results after a couple meetings I have this afternoon.

ramess101 commented 6 years ago

@mostafa-razavi @mrshirts

I think we have resolved this issue!!!

Here are my simulation results using fixed bonds and 800 molecules at every state point. The agreement is excellent for Z at every state point relative to Cassandra (which also used fixed bonds). Udep agrees very well with LAMMPS (despite LAMMPS having harmonics) and the disagreement with Cassandra is likely a misinterpretation on my part of the values Mostafa reported. The deviation between LAMMPS Z is not concerning, especially considering the large uncertainties in pressures for harmonic bonds.

Now that we get consistent results for Udep and Z, the ITIC results look great:

Deviations with respect to REFPROP:

Deviations between a fit to the ITIC results and the Siepmann values:

The agreement of ITIC with Siepmann's GEMC results is now as good as I observed for the smaller n-alkanes. Specifically, liquid density agrees within 1% and vapor pressure deviates the most at low T, but within 3-10%.

Thank you everyone for helping resolve this issue!!! I am confident that it would have taken me 2-3 times longer (if ever) to find the problem without your input, so thank you very much.

In conclusion, these were the main issues we ran into:

Harmonic bonds caused large variations in pressures
Excluding the INTRA portion of the Vdw interactions
Number of molecules (especially at low density on isotherm)

Most of these issues have already been addressed by Mostafa elsewhere, so perhaps not a ton of insight was gained by my struggles. But we (or at least I) now have a greater appreciation for the care one must take to get Udep and Z that are accurate enough for ITIC. In addition, we can now compare Gromacs, LAMMPS, and Cassandra for this system.

@mostafa-razavi Do you think it would be worthwhile for me to repeat my Gromacs runs using the same number of molecules as you? I used 800 for every state point while you used 600 at low density and 150 everywhere else. I am satisfied with my results, but for comparison with LAMMPS/Cassandra would you benefit from me using 600/150 like you did?

mostafa-razavi commented 6 years ago

Udep agrees very well with LAMMPS (despite LAMMPS having harmonics) and the disagreement with Cassandra is likely a misinterpretation on my part of the values Mostafa reported.

The way I calculate uDep in my ITIC code is

uDep = ( ePot - eBond - eIntraVdw ) / (Nmolec 1.98710^-3 * T)

where ePot is total potential energy and eBond is bonded intramolecular energy (bond, angle, torsion)

Using this formula, uDep from Cassandra (fixed-bond) and LAMMPS (flexible-bond) match (less than %0.5 difference). However, when we use

uDep = ( eVdw - eIntraVdw ) / (Nmolec 1.98710^-3 * T)

the Cassandra Udep values are 2-5% greater than those from LAMMPS. As you said this is due to eBond which is lower in Cassandra since we have fixed bonds. In other words, eVdw (and eIntraVdw) is influenced by bonds being fixed or flexible.

Do you think it would be worthwhile for me to repeat my Gromacs runs using the same number of molecules as you? I used 800 for every state point while you used 600 at low density and 150 everywhere else. I am satisfied with my results, but for comparison with LAMMPS/Cassandra would you benefit from me using 600/150 like you did?

I don't think we need to repeat the Gromacs simulations with 150(x4) molecules, but it would be useful to have the exact values of your Gromacs simulations (fixed-bond and flexible-bond). I couldn't find them in your "TraPPE_C8_harmonic.zip" attachment. Can you point out to the right file? I basically need a file a file that looks like my CassandraRdr.res or LammpsRdr.res files. It would also be great if I had the values of pSat and rhoL and rhoV and possibly Hvap. I would like to input your Gromacs results into my ITIC code and compare the results with your pSat, rhoL, rhoV, and Hvap.

Also, does the y axis in your uDep plot represent uDep/R or (U-Uig)/R? my uDep is defined as (U-Uig)/RT. I think your y-axis should be (U-Uig)/R or uDep*T.

ramess101 commented 6 years ago

@mostafa-razavi

The way I calculate uDep in my ITIC code is

uDep = ( ePot - eBond - eIntraVdw ) / (Nmolec 1.98710^-3 * T)

where ePot is total potential energy and eBond is bonded intramolecular energy (bond, angle, torsion)

Using this formula, uDep from Cassandra (fixed-bond) and LAMMPS (flexible-bond) match (less than %0.5 difference).

OK, thanks for clarifying that. Now the energies are in excellent agreement. The Gromacs and Cassandra values are nearly identical.

These are my fixed bond length simulation results:

does the y axis in your uDep plot represent uDep/R or (U-Uig)/R? my uDep is defined as (U-Uig)/RT. I think your y-axis should be (U-Uig)/R or uDep*T.

Yes, my y-axis is (U-Uig)/R. I use Udep (capital U) to differentiate my Udep from your definition of uDep. Sorry for the confusion.

Can you point out to the right file? I basically need a file a file that looks like my CassandraRdr.res or LammpsRdr.res files. It would also be great if I had the values of pSat and rhoL and rhoV and possibly Hvap. I would like to input your Gromacs results into my ITIC code and compare the results with your pSat, rhoL, rhoV, and Hvap.

I will post these files either tonight or tomorrow. I would also be very interested to see how similar our ITIC values are using the same Udep and Z values.

How are you computing Hvap? I can think of a few ways to approximate it. For example, are you assuming the vapor phase is an ideal gas since we don't actually simulate the saturated vapor?

ramess101 commented 6 years ago

@mostafa-razavi

I have attached a folder with the desired files. Please read the README as it explains what is in each file.

Please note that my GromacsRdr.res file does not have the exact same energy columns as your files, namely, I have a single column that corresponds to the INTERmolecular Lennard-Jones energy, i.e. uDep = ( UINTERvdw ) / (Nmolec 1.98710^-3 * T).

Also, note that I calculated ITIC using both the REFPROP B2 and B3 values as well as the correlations you provide me with for B2 and B3.

Gromacs_ITIC_comparison.zip

mostafa-razavi commented 6 years ago

@ramess101 I used Gromacs results (GromacsRdr.res) in my code and I got the following results. These tables provide deviations between Mostafa's ITIC results and Richard's ITIC results calculated this way:

Dev % = (Richard - Mostafa)/Mostafa*100

Schultz's B2_IT:

Mostafa’s Tr	Tsat Dev%	Psat Dev%	rhoL Dev%	rhoV Dev%
0.87	0.051	-0.873	0.002	-1.419
0.80	-0.017	-1.196	-0.002	-1.422
0.72	-0.013	-1.704	-0.007	-1.833
0.62	-0.009	-1.308	0.005	-1.331
0.51	-0.005	-3.028	0.000	-3.031

This was when B2_IT=-4.3839 ccg (Schultz value). Since the pSat deviation at Tr=0.51 is large I thought maybe you are using B2_IT of REFPROP (4.8816 ccg) instead of Schultz's value of -4.3839 ccg. Is that right? When I used B2_IT=4.8816 ccg deviations improved, but still there is some deviations that I can't figure out.

REFPROP B2_IT:

Mostafa’s Tr	Tsat Dev%	Psat Dev%	rhoL Dev%	rhoV Dev%
0.87	0.076	1.770	0.002	2.399
0.80	-0.008	0.913	-0.002	1.129
0.72	-0.011	0.132	-0.007	0.160
0.62	-0.009	0.411	0.005	0.432
0.51	-0.005	-1.381	0.000	-1.382

I can think of a few sources of discrepency: 1- Number of fixed-point iterations. In other words, the tolerance use to terminate iterations 2- Initial guess for rhoV in fixed-point method

Can you show me the piece of ITIC code that you are using in the repository?

ramess101 commented 6 years ago

@mostafa-razavi

Thanks for doing this analysis.

I thought maybe you are using B2_IT of REFPROP (4.8816 ccg) instead of Schultz's value of -4.3839 ccg. Is that right?

The file named "ITIC_TraPPE_Schultz_B2_B3_Gromacs" used the Schultz B2 and B3 correlations you gave me. The files named "ITIC_TraPPE_REFPROP_B2_B3_Gromacs" used the REFPROP B2 and B3 values.

Which file did you use in your comparison?

mostafa-razavi commented 6 years ago

@ramess101

I used ITIC_TraPPE_Schultz_B2_B3_Gromacs. OK, so you also used my Schultz's correlation to get the B2 at isotherm temperature, i.e. B2_IT=-4.3839 ccg. Right? In that case, there is a discrepency between our codes which is reflected in the first table in my previous comment.

ramess101 commented 6 years ago

@mostafa-razavi

B2_IT=-4.3839 ccg. Right?

Yes. Although we must use a slightly different Tc in the correlation since I get -4.3845 ccg, but that shouldn't matter much.

I can think of a few sources of discrepency: 1- Number of fixed-point iterations. In other words, the tolerance use to terminate iterations 2- Initial guess for rhoV in fixed-point method

I verified that my rhov initial guess and iteration tolerance does not impact my results.

This discrepancy does not surprise me since I am using a slightly different data analysis approach than you are. For example, I fit a polynomial to (Z-1)/rho vs rho and U/RT vs 1/T before integrating for Adep. So in essence, you perform a numerical (approximate) integration of the "exact" simulation output while I approximate the simulation output with a polynomial fit and then perform an "exact" integration. I assume this is the primary reason for the discrepancy.

I guess one question is, do your results agree more closely with the TraPPE values reported by Siepmann using GEMC? I assume both our values would be within the reported uncertainties though.

mostafa-razavi commented 6 years ago

Hmmm... What is the degree of polynomial that you are using for (Z-1)/rho vs rho and U/RT vs 1/T?

ramess101 commented 6 years ago

I use either a third or fourth order polynomial fit for Z-1/rho vs rho (they are usually very similar) and a linear fit for U/RT vs 1/T.

mostafa-razavi commented 6 years ago

I compared linear vs quadratic interpolation along isochores in my NIST Validation analysis for C12. It turned out that the curvature in (U-Uig)/R vs 1/T has a relatively significant impact on pSat at high densities (e.g. TrSat=0.45).

Plot of (U-Uig)/R: udept-vs-1000overt

Plot of (U-Uig)/(RT): udep-vs-1000overt

Psat plot when linear regression is used for integrating along isochore: psat-islineartrue

Psat plot when quadratic regression is used for integrating along isochore: psat-islinearfalse

NIST validation results when uDep is integrated using Simpson's rule:

NIST validation results when uDep is integrated using linear regression:

ramess101 / MBAR_ITIC

Additional Compounds #9