Open GiovanniBussi opened 3 years ago
Actually, the log files report something like this at the end of the simulation: State A
...
MC-lambda information
N VdwL Count G(in kT) dG(in kT)
1 0.000 627679 0.00000 0.80801
2 0.200 620237 0.80801 0.80064
3 0.400 614937 1.60866 -0.07378
4 0.500 613123 1.53488 -0.27348 <<
5 0.600 612967 1.26140 -0.75272
6 0.700 619788 0.50868 -1.39003
7 0.800 628721 -0.88135 -1.73766
8 1.000 662548 -2.61900 0.00000
...
State B
...
MC-lambda information
N VdwL Count G(in kT) dG(in kT)
1 0.000 615655 0.00000 0.71877
2 0.200 617183 0.71877 0.53932
3 0.400 598749 1.25809 -0.26488
4 0.500 604316 0.99321 -0.42375
5 0.600 615069 0.56946 -1.46753
6 0.700 629927 -0.89807 -2.02460
7 0.800 641608 -2.92267 -1.92462 <<
8 1.000 677493 -4.84729 0.00000
...
So the difference: 4.84729 - 2.61900=2.22829
might explain the difference that you observe (if I interpret correctly and these are the WL weights).
It looks like these weights are fluctuating along the simulation, so "discounting" for them would be slightly more tricky (but I think still possible).
Anyway, my initial understanding was that you wanted to replace WL on lambda with METAD.
Hi @GiovanniBussi , thanks a lot for the quick reply! In alchemical metadynamics, I believe that the Wang-Landau-related parameters are either deactivated or overwritten. The statements below are all based on the assumption that the output files (log file, COLVAR, and HILLS) are reflecting things correctly. In brief, parameters in GROMACS mdp file can only influence the update of the alchemical states, not their weights. The weights are only controlled by the parameters in plumed.dat
.
Specifically, in expanded ensemble simulation, nstexpanded
is the number of integration steps between attempted moves in the alchemical space. In expanded ensemble, whenever the system moves to another state, the value of G (or weight) of the new state would decrease by init-wl-delta
. On the other hand, if the system decides to stay at the original state, the weight of the original state would decrease by init-wl-delta
. The value of init-wl-delta
will be scaled by a factor of wl-scale
if the flatness criterion is reached. (The criterion is defined by wl-ratio
.). That is, in expanded ensemble, nstexpanded
controls the frequency of updating the states and their weights at the same time. On the other hand, in alchemical metadynamics, nstexpanded
and other parameters (basically the parameters you listed above) are only for updating the states. The deposition of the weights/Gaussians is only controlled by the parameters in plumed.dat
, including PACE
(the stride for adding Gaussians) and HEIGHT
(the Gaussian height, which replaces the GROMACS parameter init-wl-delta
).
To confirm this, I've performed a simple experiment (already pushed to the folder Problem_1/state_A/test
), in which I changed nstlog
. (All other parameters stay the same.) Below are some interpretation of the result:
Before 1 ps, the weight of each state remained 0 even if the counts were accumulated. → This indicates that nstexpanded
worked as described above. (In expanded ensemble, the weight would change whenever the count of the state changes.)
At 1 ps, the first Gaussian centered at (theta, n) = (-3.030568, 0) was deposited and below is the corresponding part of the log file:
Step Time
500 1.00000
MC-lambda information
N VdwL Count G(in kT) dG(in kT)
1 0.000 42 0.00000 1.00050 <<
2 0.200 7 1.00050 0.00000
3 0.400 1 1.00050 0.00000
4 0.500 0 1.00050 0.00000
5 0.600 0 1.00050 0.00000
6 0.700 0 1.00050 0.00000
7 0.800 0 1.00050 0.00000
8 1.000 0 1.00050 0.00000
The value 1.00050 comes from the fact that the HEIGHT
parameter was set as 2.478956208925815 kJ/mol or 1 kT. (The difference of 0.00050 kT is probably just the rounding error in unit conversion.) This is the evidence showing that the GROMACS parameter init-wl-delta
was ignored, or the value should have be 0.5 kT instead of 1 kT. (Note: Since 1.00050 kT was deposited at lambda=0 (the first state), its free energy estimate was -1.00050 kT. In the log file, the weight of the first state is always shifted to 0, leading to what we see above.)
At 510 steps, the system moved to another point in the sampling space, where is (-2.881022, 0) and the corresponding part of the log file is as follows:
Step Time
510 1.02000
MC-lambda information
N VdwL Count G(in kT) dG(in kT)
1 0.000 43 0.00000 0.95674 <<
2 0.200 7 0.95674 0.00000
3 0.400 1 0.95674 0.00000
4 0.500 0 0.95674 0.00000
5 0.600 0 0.95674 0.00000
6 0.700 0 0.95674 0.00000
7 0.800 0 0.95674 0.00000
8 1.000 0 0.95674 0.00000
In the log file, the documented values of G correspond to the free energy landscape in the lambda direction with all the other CV(s) fixed. Therefore, the values shown above are not affected by which state the system was at the moment. The value 0.95674 is the value of Gaussian at the state point (-2.881022, 0) after the first deposition of the Gaussian centered at (-3.030568, 0). In the 2D case, the values above correspond to the slice of the 2D free energy surface at theta=-2.881022.
As such, the weights are indeed fluctuating since the system is at different state points at different time frames and Gaussians were constantly added. And yes, the discrepancy we saw in the analysis is also reflected by the difference in the count/weight differences between the coupled and uncoupled states in the log files of the two simulations.
Since the parameters behaved as described above, I believe I was only using METAD instead of WL + METAD.
Unfortunately, Pascal Merz (@ptmerz) isn't on this conversation (and left the lab July 1st and is on vacation now), as he did the implementation, but yes,
I believe that the Wang-Landau-related parameters are either deactivated or overwritten Is what is supposed to be happening.
I'm a little confused by the example. Because there are no weights added before 1 ps, I would imagine that PACE=500 (which I confirmed by looking at the plumed.day file). My understanding is that then the weights should not change until 2 ps - that the Gaussian laid down at 1 ps, with height = 1.00050 would remain until 2 ps. But the height changes at step 510 = 1.02 ps. Am I misunderstanding PACE?
Hi @mrshirts, as I mentioned in my previous comment, in the log file, the documented values of G correspond to the free energy landscape in the lambda direction with all the other CV(s) fixed. There was no second Gaussian bias added until 2 ps. The reason that the weight at step 510 is not 1.00050 is that at step 510, the system was at a different place in the sampling space from where it was at step 500. Specifically, here is how the value 0.95674 was calculated:
PACE
was set at 500, the first Gaussian (centered at (-3.030568, 0)) was added. As can be check in plumed.dat
, the Gaussian widths along the configuration CV and lambda directions were 0.5 and 0.0001, respectively.nstexpanded
was set as 10). The system was sampling the configurational space with normal MD.The details of the calculation can be seen here.
Ah, so what the plumed is doing is reporting back to GROMACS the weights at (current_CV,lambda_i), so it will change with each step as the CV changes even if lambda changes. OK, I understand now. That probably should be documented clearly; though I'm not quite sure where that should go in the documentation, probably on the PLUMED side since it is a modification that PLUMED is making to GROMACS.
Yes I agree that this should be documented clearly. We also want to be clear what GROMACS parameters are deactivated or overwritten in alchemical metadynamics, which @ptmerz might the person who knows the best. I'll list writing a tutorial of alchemical metadynamics as a to-do for the project.
@wehs7661 I see this makes sense, and explains why the weights in the log file seems corresponding to the difference that you observe in the result (they are the difference).
Just to be 100% sure, this implies that if you run another run with hills height = 0 the reported weights in the gromacs log will be zero throughout the whole simulation, right?
Personally, I find this a bit confusing. I would have found more intuitive that the user had to tell GROMACS "please do not update weights", and the final weights to be a combination of GROMACS weights (possibly always zero) and PLUMED weights. Conceptually, I see this as the forces acting on the atoms: they are a combination of force field terms (from GROMACS) and bias terms (from PLUMED), the latter not overrididing the former. Anyway, this is a matter of personal taste. I think that using both GROMACS and PLUMED to update weights simultaneously might be a bit tricky (in the analysis), so I am fine with disabling it, as it seems the case now.
Bottom line: I didn't find the problem, I will still search for it.
@mrshirts related to the documentation, in principle this is not related to PLUMED but to the GROMACS patch distributed with PLUMED. We have another case like this: PLUMED adds an implementation of Hamiltonian replica exchange to GROMACS (gmx mdrun -hrex
, that enables to just build multiple tpr without using the slower free-energy framework). This is GROMACS specific, and so does not fit the general PLUMED documentation, and due to that I wrote a separate tutorial here. We could do something similar for this project.
Personally, I find this a bit confusing. I would have found more intuitive that the user had to tell GROMACS "please do not update weights", and the final weights to be a combination of GROMACS weights (possibly always zero) and PLUMED weights.
For the implementation, it was easier to just tell GROMACS not to do anything with the weights.
Conceptually, I see this as the forces acting on the atoms: they are a combination of force field terms (from GROMACS) and bias terms (from PLUMED), the latter not overrididing the former. Anyway, this is a matter of personal taste. I think that using both GROMACS and PLUMED to update weights simultaneously might be a bit tricky (in the analysis), so I am fine with disabling it, as it seems the case now.
The biases are handled a bit differently in the GROMACS expanded ensemble code, so it was very easy just to tell GROMACS "don't calculate the biases, just get them from PLUMED".
Bottom line: I didn't find the problem, I will still search for it.
Thanks! This is very mystifying to us.
@mrshirts related to the documentation, in principle this is not related to PLUMED but to the GROMACS patch distributed with PLUMED. We have another case like this: PLUMED adds an implementation of Hamiltonian replica exchange to GROMACS (
gmx mdrun -hrex
, that enables to just build multiple tpr without using the slower free-energy framework). This is GROMACS specific, and so does not fit the general PLUMED documentation, and due to that I wrote a separate tutorial here. We could do something similar for this project.
Yes, this makes sense as a place to put things. Wei-Tse is working on the text for this.
Just to be 100% sure, this implies that if you run another run with hills height = 0 the reported weights in the gromacs log will be zero throughout the whole simulation, right?
Yes, @GiovanniBussi this is correct. I've just conducted another short test setting HILLS
as 0 and the weights documented in the log file were all 0 throughout the simulation. I've pushed the results to Problem_1/state_A/test_2
if you are interested in checking. (The previous test with nstlog=1
has been renamed as test_1
.)
I might have found the error!
tail */*output.gro
==> state_A/sys2_cccc_output.gro <==
282SOL OW 845 2.072 1.743 0.612 0.1742 0.0275 -0.1960
282SOL HW1 846 2.029 1.789 0.685 1.3405 1.7166 -0.5858
282SOL HW2 847 2.165 1.761 0.624 0.6581 -2.5168 -0.1508
283SOL OW 848 0.304 1.143 0.864 -0.7747 0.5754 -0.6495
283SOL HW1 849 0.320 1.117 0.774 -1.9513 -1.3033 -0.2954
283SOL HW2 850 0.211 1.162 0.868 -0.6949 0.8174 0.1281
284SOL OW 851 0.710 1.729 1.022 -0.2550 -0.3419 -0.2911
284SOL HW1 852 0.704 1.819 0.992 -2.1539 -0.3927 -0.0682
284SOL HW2 853 0.795 1.699 0.991 1.3567 2.2782 1.6888
2.33756 2.33756 1.65290 0.00000 0.00000 0.00000 0.00000 1.16877 1.16877
==> state_B/sys2_cccc_output.gro <==
282SOL OW 845 2.095 0.125 0.811 0.1491 -0.0222 0.1667
282SOL HW1 846 2.024 0.184 0.786 0.1331 -0.4980 -0.9127
282SOL HW2 847 2.139 0.105 0.728 0.7472 -0.7458 0.6670
283SOL OW 848 1.846 0.992 1.133 0.0266 0.1476 -0.2237
283SOL HW1 849 1.883 0.906 1.117 -0.0113 0.7019 -3.1915
283SOL HW2 850 1.753 0.976 1.149 0.1626 -0.4232 0.0073
284SOL OW 851 1.332 0.702 0.039 -0.0407 -0.0890 0.3202
284SOL HW1 852 1.267 0.769 0.063 0.0677 0.5280 -1.0854
284SOL HW2 853 1.365 0.669 0.122 -0.0387 1.6700 1.0161
2.29609 2.29609 1.62358 0.00000 0.00000 0.00000 0.00000 1.14804 1.14804
Boxes are different, and you are running NVT:
grep pcoupl */*expanded.mdp
state_A/sys2_expanded.mdp:pcoupl = no
state_B/sys2_expanded.mdp:pcoupl = no
Can you double check? If this is confirmed, please rerun them with the same box (I guess you can adjust it by hand and then quickly re-equilibrate the particles)
Hi @GiovanniBussi , thanks for catching that! Yes, the simulations were run in NVT ensemble so the box volumes were different. I will rerun the simulations with the same size of the box and repot that I see. Thank you!
Hi @GiovanniBussi , I'm sorry for the late reply! I've been trying to examine the previous simulations carefully and I might have a new problem with the data analysis method. But I think different box volumes were indeed responsible for the majority of the discrepancy we've seen. I'm currently re-organizing/updating Problem 1 and I'll let you know when it's ready!
Several days ago, I've also updated Problem 2 according to my current progress of the research. Problem 2 is mainly about strategies for deciding parameters for 2D alchemical metadynamics simulations and I have opened up a new issue for it so that we can separate the discussions for different problems. If you could have a look at the jupyter notebook in the folder Problem_2
, that would be very helpful. Thanks a lot for your valuable input!
@wehs7661 @mrshirts as usual, questions via GH issues :-)
I am focusing on Problem 1 now. Didn't go through the full analysis yet, but I did some quick check on the results and I agree, they are different and this is unexpected. I mean:
We are in case 2 now.
Little doubt: I am not very familiar with the WL implemented in GROMACS, so I cannot decipher this in the log file:
Does it imply you are using WL as well? I though you were expecting METAD to help sampling lambda, and thus disabled WL (maybe this is what you did... can you confirm?). If you are using WL + METAD, we should take that into account when you do the analysis (it's possible). A quick check in the gromacs manual tells me that
lmc-weights-equil = wl-delta
implies that you are changing the weights until some point. If the weights at that point (and then used in the following part of the simulation) are different in the two simulations, this would explain your weird behavior. By knowing these weights (if they are frozen for a significant part of the simulation) we could easily correct the result.If instead the input implies that you are not using WL, forgive my ignorance on GROMACS input, I will continue to hunt for the reason of the discrepancy somewhere else