choderalab / yank

An open, extensible Python framework for GPU-accelerated alchemical free energy calculations.
http://getyank.org
MIT License

How to analyze subsets of the simulation #792

Open elkhoury opened 7 years ago

elkhoury commented 7 years ago

We would like to analyze a subset of the simulation (e.g. 1000 iterations out of 20000) to obtain the calculated free energies and the convergence plots for that subset.

Is there a way to automatically analyze subsets of the data?

Thank you,

Léa El Khoury

(@davidlmobley, @renm1)

jchodera commented 7 years ago

This sounds like something we should be able to easily add to the CLI for analysis. We already support --start XXX and --end YYY for trajectory analysis, so we'd just have to pass these on to yank.analysis.analyze_directory() and modify this code to handle that.

Lnaden commented 7 years ago

Should be easy enough. There is no way to do this through the public API right now, but you can follow the logic of the analyze_directory function in the source code @jchodera linked to and piece together a temporary solution.

@elkhoury by "convergence plots", do you mean the plots in the Jupyter notebook generated by yank analyze report?

elkhoury commented 7 years ago

Yes, I mean the convergence plots generated in the Jupyter Notebooks.

Thank you @jchodera, @lnaden

andrrizzi commented 6 years ago

The Analyzer classes now have an Analyzer.max_n_iterations property that you can use to limit the number of iterations you want to analyze (implemented in #915). There's still no hook through the CLI for this, but for now, you can add

analyzer_kwargs['max_n_iterations'] = 1000

in the first cell of the Jupyter notebook.
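
If the goal is a convergence check over several subset lengths rather than a single cutoff, one option is to prepare one set of keyword arguments per cutoff and rerun the notebook with each. The following is only a rough sketch: the helper names iteration_cutoffs and kwargs_for_cutoffs are hypothetical and not part of YANK, and the only name taken from this thread is the 'max_n_iterations' key. How each dictionary is then passed to the analysis depends on the YANK version and is left out.

```python
def iteration_cutoffs(total_iterations, n_subsets):
    """Return n_subsets evenly spaced cutoffs; the last equals total_iterations.

    Hypothetical helper, not part of YANK.
    """
    if n_subsets < 1 or total_iterations < n_subsets:
        raise ValueError("need 1 <= n_subsets <= total_iterations")
    return [total_iterations * (i + 1) // n_subsets for i in range(n_subsets)]


def kwargs_for_cutoffs(base_kwargs, cutoffs):
    """Copy base_kwargs once per cutoff and set 'max_n_iterations' in each copy.

    Hypothetical helper; base_kwargs itself is left unmodified.
    """
    return [dict(base_kwargs, max_n_iterations=cutoff) for cutoff in cutoffs]


# Example: four subsets of a 20000-iteration simulation.
cutoffs = iteration_cutoffs(total_iterations=20000, n_subsets=4)
all_kwargs = kwargs_for_cutoffs({}, cutoffs)
for kwargs in all_kwargs:
    print(kwargs)
```

Each resulting dictionary could then be used as analyzer_kwargs in the first cell of a separate copy of the notebook, so that the free energies from increasing amounts of data can be compared.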