MDAnalysis / mdanalysis

MDAnalysis is a Python library to analyze molecular dynamics simulations.
https://mdanalysis.org
Other
1.32k stars 652 forks source link

change nucleicacids results dict #3744

Closed orbeckst closed 1 year ago

orbeckst commented 2 years ago

Is your feature request related to a problem?

The new experimental nucleicacids analysis module contains the https://github.com/MDAnalysis/mdanalysis/blob/c01d82bb86b480c8b2ae87f36497e7fc0a726c84/package/MDAnalysis/analysis/nucleicacids.py#L61 class as its base class. It presents results in a Results dict. The content is different from what other AnalysisBase-based classes use in that

Specifically, for each of the selections, a dict is stored https://github.com/MDAnalysis/mdanalysis/blob/c01d82bb86b480c8b2ae87f36497e7fc0a726c84/package/MDAnalysis/analysis/nucleicacids.py#L134-L135 and res_dict comes from https://github.com/MDAnalysis/mdanalysis/blob/c01d82bb86b480c8b2ae87f36497e7fc0a726c84/package/MDAnalysis/analysis/nucleicacids.py#L128-L129 where dist is an array of distances.

The "selections" are the individual base pairs so in order to query the distance of base pair j at time frame t I would use results[j][t].

Please correct me, @ALescoulie , if I misrepresented anything here.

Describe the solution you'd like

@IAlibay 's suggestion (from https://github.com/MDAnalysis/mdanalysis/pull/3735#pullrequestreview-1020755173) :

would it make sense to consider switching this to instead be results.pair_distances (maybe even making it a pairs x times 2D ndarray?

Additionally, I suggest to transpose so that the first axis corresponds to time frame and the second axis to pair index:

results.pair_distances[:, j]    # time series of pair j
results.pair_distances[-1]      # all distances at the last time frame

The "one observation per row" layout is canonical and also used e.g. in the RMSD results.

Describe alternatives you've considered

Don't change anything... but this makes this module behave and feel different from the others and tools such as mdacli might have a hard time working with it — comments by @PicoCentauri and @joaomcteixeira especially welcome on how to best structure results.

Additional context

IAlibay commented 2 years ago

I'm moving this to target 2.4.0 instead.