markovmodel / PyEMMA

🚂 Python API for Emma's Markov Model Algorithms 🚂
http://pyemma.org
GNU Lesser General Public License v3.0

Is TRAM appropriate in this setting? #1518

Closed by onehalfatsquared 2 years ago

onehalfatsquared commented 2 years ago

Hello,

I am constructing MSMs to get a coarse-grained representation of my system's dynamics at various values of a system parameter, in this case temperature. I have constructed an MSM from the data at each temperature separately, simply using estimate_markov_model(), which is working as expected. Since I have data at many temperatures, I figured TRAM might give better estimates than considering each temperature in isolation.

I ran TRAM by calling the estimate_multi_temperature() function and evaluated the resulting MSM by plotting the occupation probability of a desired target state as a function of time. I did this for both the original MSM and the TRAM MSM, and compared each to the simulation data for reference. I find that the original MSM does a decent job of capturing the dynamics, but the TRAM MSM is quite far off. I'll include the plot here for reference:

[Image: tram_result_lowT]
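For readers who want to reproduce this kind of comparison, here is a minimal sketch of how an occupation-probability curve can be computed from a transition matrix by repeated propagation of an initial distribution. It uses plain numpy rather than any PyEMMA call; the function name `target_occupation` and the 3-state matrix are hypothetical and only illustrate the idea. Each step corresponds to one lag time.

```python
import numpy as np

def target_occupation(T, p0, target, n_steps):
    """Occupation probability of `target` after 0, 1, ..., n_steps lag times.

    T  : row-stochastic transition matrix, shape (n, n)
    p0 : initial probability distribution over the n states
    """
    T = np.asarray(T, dtype=float)
    p = np.asarray(p0, dtype=float)
    occupation = np.empty(n_steps + 1)
    for k in range(n_steps + 1):
        occupation[k] = p[target]
        p = p @ T  # propagate the distribution by one lag time
    return occupation

# Hypothetical 3-state example (not data from this issue); state 2 is the target.
T_example = np.array([[0.9, 0.1, 0.0],
                      [0.1, 0.8, 0.1],
                      [0.0, 0.1, 0.9]])
p0_example = np.array([1.0, 0.0, 0.0])
occ = target_occupation(T_example, p0_example, target=2, n_steps=50)
```

Running the same function on the transition matrices of the two models, with the same initial distribution and lag time, gives curves that can be plotted against the fraction of simulation trajectories found in the target state at each time.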

Does this result make sense, or is it possible I'm doing something wrong in my TRAM call? Both models are constructed using the same lag time. The only difference I can think of is that in the TRAM call I had to set connectivity='summed_count_matrix', which the documentation says is not recommended. If I do not set this, the step that determines the connected set stalls. Could that point toward the issue here?

More broadly, would my situation be a good use case for TRAM? I am interested in every temperature ensemble, not just a single "unbiased" ensemble.

Thanks for any help!

clonker commented 2 years ago

ping @Olllom and @MaaikeG

clonker commented 2 years ago

It might be that your thermodynamic states don't overlap, in which case the summed_count_matrix is not appropriate.
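One rough way to probe overlap between two temperature ensembles is to ask how often a replica-exchange swap between them would be accepted, given the potential energies sampled at each temperature. The sketch below is only a heuristic under stated assumptions, not PyEMMA's connectivity check: it assumes energies in kcal/mol and temperatures in K, and the function name `mean_exchange_acceptance` is hypothetical. It uses the standard parallel-tempering acceptance probability min(1, exp[(beta_a - beta_b)(E_a - E_b)]).

```python
import numpy as np

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K); assumed energy unit

def mean_exchange_acceptance(energies_a, energies_b, temp_a, temp_b,
                             n_pairs=10000, seed=0):
    """Average swap acceptance between samples drawn at temp_a and temp_b.

    energies_a, energies_b : potential energies sampled at temp_a and temp_b
    Values near 0 suggest poor overlap; values near 1 suggest strong overlap.
    """
    rng = np.random.default_rng(seed)
    # Draw random pairs of configurations, one from each ensemble.
    e_a = rng.choice(np.asarray(energies_a, dtype=float), size=n_pairs)
    e_b = rng.choice(np.asarray(energies_b, dtype=float), size=n_pairs)
    beta_a = 1.0 / (KB_KCAL * temp_a)
    beta_b = 1.0 / (KB_KCAL * temp_b)
    # Log of the Metropolis swap ratio, capped at 0 so that exp(...) <= 1.
    log_acc = np.minimum(0.0, (beta_a - beta_b) * (e_a - e_b))
    return float(np.mean(np.exp(log_acc)))
```

Applied to each pair of neighboring temperatures, a value close to zero would indicate that those two ensembles barely share any energy range.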

onehalfatsquared commented 2 years ago

Thank you for the reply. I am not entirely sure how to check whether there is sufficient overlap between thermodynamic states, but I do know that the highest-temperature ensemble produces a reversible MSM over the full set of discrete states. The lowest-temperature data, however, does not sample escapes from a handful of states. If I understand correctly, the states overlap here because there is a reversible path between any two discrete states in the high-temperature ensemble?
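The two observations above (full reversible connectivity at high temperature, unsampled escapes at low temperature) can be checked directly on the per-temperature count matrices. This is a minimal numpy sketch, assuming one transition count matrix per temperature is available; the helper names and the small example matrices are hypothetical and are not taken from PyEMMA or from the data in this issue.

```python
import numpy as np

def states_without_escape(count_matrix):
    """Indices of visited states that have no observed transition to another state."""
    C = np.asarray(count_matrix)
    outgoing = C.sum(axis=1) - np.diag(C)          # counts leaving each state
    visited = (C.sum(axis=1) + C.sum(axis=0)) > 0  # state appears in the data
    return np.flatnonzero(visited & (outgoing == 0))

def is_reversibly_connected(count_matrix):
    """True if every state can reach every other state via observed transitions."""
    A = np.asarray(count_matrix) > 0
    n = A.shape[0]
    reach = A | np.eye(n, dtype=bool)
    # Repeated squaring: after k rounds, `reach` covers paths of length <= 2**k.
    for _ in range(int(np.ceil(np.log2(max(n, 2))))):
        reach = reach | ((reach.astype(int) @ reach.astype(int)) > 0)
    return bool(reach.all())

# Hypothetical count matrices for a high and a low temperature.
C_high = np.array([[5, 2, 0],
                   [2, 5, 2],
                   [0, 2, 5]])
C_low = np.array([[5, 2, 0],
                  [0, 7, 0],   # state 1 is entered but never left
                  [0, 2, 5]])
```

In this toy example `C_high` is reversibly connected while `C_low` is not, and `states_without_escape(C_low)` reports state 1. Note that connectivity of the discrete states within one ensemble is a separate question from whether neighboring thermodynamic states overlap in energy.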

I repeated my calculations with connectivity='reversible_pathways', which does not stall for me when computing the connected set. The entries of the new transition matrix are the same as in my previous calculation, up to machine precision, and the plot I generated remains the same. Is this enough to rule out overlap as the problem here?

clonker commented 2 years ago

ping again @Olllom and @MaaikeG

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.