Open lilyminium opened 4 years ago
That’s a neat idea. It’s also fairly localized, so it could even be labelled “experimental” for a start until we figure out under which circumstances it breaks.
I also like it because with ChainReader and Ensemble we would be supplying functionality to aggregate existing data in a way that makes analysis simpler.
@alescoulie did you want to share your thoughts on ensemble analysis here?
I've been experimenting with ensemble analysis and wrote a decorator that wraps analysis base classes so that they accept an ensemble group. The group is built around a dictionary of universes, with some of the Universe functionality extended to it. It's a relatively simple solution compared to what @lilyminium has been working on. I've started working on a PR based on this for MDAnalysis.
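To make the decorator idea above concrete, here is a minimal sketch under stated assumptions. It is not the code from the PR: `accepts_ensemble` and `FrameCount` are hypothetical names, and each "universe" is a plain list of frames so the example runs without MDAnalysis or trajectory files.

```python
def accepts_ensemble(analysis_cls):
    """Hypothetical decorator: let an analysis class take a dict of universes."""

    def run_on_ensemble(ensemble, *args, **kwargs):
        results = {}
        for label, universe in ensemble.items():
            # Build and run one ordinary analysis per ensemble member.
            analysis = analysis_cls(universe, *args, **kwargs).run()
            results[label] = analysis.results
        return results

    return run_on_ensemble


@accepts_ensemble
class FrameCount:
    """Toy analysis (assumption); a "universe" here is just a list of frames."""

    def __init__(self, universe):
        self.universe = universe
        self.results = None

    def run(self):
        self.results = len(self.universe)
        return self


# The "ensemble group" is only a dictionary of labelled universes.
ensemble = {"rep1": ["frame0", "frame1"], "rep2": ["frame0", "frame1", "frame2"]}
results = FrameCount(ensemble)
```

Keying the results by the same labels as the ensemble keeps track of which replicate each result came from. The trade-off in this sketch is that the decorated name becomes a function rather than a class.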
Is your feature request related to a problem?
I think a lot of people simulate in replicate and need to account for that, e.g. when calculating PCA for a shared subspace or clustering over an ensemble. It's always a bit tedious tracking which frame is from where. Also, analyses like the encore library operate on ensembles. Encore deals with this by merging them into a new Universe and pulling all the frames into memory, sometimes multiple times. Memory use gets prohibitively large quite quickly, especially when you only care about a small selection of atoms (e.g. alpha-carbons) in your full solvated universe.
Describe the solution you'd like
Have a Universe container that holds Universes and has convenience methods for iterating over selected frames and selecting atoms from each Universe, allowing for different atom indices from each.
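A rough sketch of what such a container could look like, assuming nothing about the eventual implementation: `ToyUniverse` is a stand-in for a real Universe, and `Ensemble`, `select` and `iter_frames` are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ToyUniverse:
    """Stand-in for a Universe: atom names plus per-frame coordinates."""
    names: list
    frames: list  # frames[i][j] is the coordinate of atom j in frame i

    def select(self, name):
        # Indices of all atoms with the given name.
        return [i for i, n in enumerate(self.names) if n == name]


class Ensemble:
    """Hypothetical container that holds several universes."""

    def __init__(self, universes):
        self.universes = list(universes)

    def select(self, name):
        # One index list per universe; the indices may differ between members.
        return [u.select(name) for u in self.universes]

    def iter_frames(self, name, start=None, stop=None, step=None):
        # Yield (universe index, frame index, selected coordinates) lazily,
        # so only one frame's selection is materialised at a time.
        for k, u in enumerate(self.universes):
            idx = u.select(name)
            for f in range(len(u.frames))[start:stop:step]:
                yield k, f, [u.frames[f][i] for i in idx]


u1 = ToyUniverse(["CA", "O", "CA"], [[0.0, 1.0, 2.0], [0.1, 1.1, 2.1]])
u2 = ToyUniverse(["O", "CA"], [[5.0, 6.0]])
ens = Ensemble([u1, u2])
records = list(ens.iter_frames("CA"))
```

Each record carries the universe index and frame index, so it stays clear which frame came from where, and only the selected atoms are ever copied out.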
Describe alternatives you've considered
Additional context
@orbeckst once noted that analysis functions can't use FrameIterators, so I overengineered a container that could be used in place of a Universe.trajectory in AnalysisBase classes.