MDAnalysis / PathSimAnalysis

An MDAKit that calculates the geometric similarity of molecular dynamics trajectories using path metrics such as the Hausdorff or Fréchet distances.
https://mdanalysis.org/PathSimAnalysis
GNU General Public License v2.0
0 stars 0 forks source link

psa.PSAnalysis code saves files and doesn't label trajectories in plots #9

Open lilyminium opened 4 years ago

lilyminium commented 4 years ago

Expected behavior

1. PSAnalysis.plot() produces similarly labelled plots as in PSAnalysisTutorial as the only output: Screenshot 2020-01-22 at 11 49 11 PM

Same with plot_annotated_heatmap.

2. It doesn't save files everywhere i.e. There are no files that always get saved, and in functions with a save or store keyword, the default is not True. Moreover, if I pass a filename to save to, the file gets saved there, rather than in a subdirectory.

3. Clustering isn't done for every plot It just a bit expensive to do it for both plotting functions instead of saving the results into the class, especially if you're twiddling with plot parameters like color maps or figure sizes.

Actual behavior

1. It actually returns some arrays and an unlabelled heat map. Screenshot 2020-01-22 at 11 48 03 PM

Maybe matplotlib updated, because setting tick labels on the minor ticks just doesn't work with the code here. (ticks show up with minor=False)

https://github.com/MDAnalysis/mdanalysis/blob/972a8b1893b5e353b9d796b15d6fcdd022a93f00/package/MDAnalysis/analysis/psa.py#L1779-L1780

https://github.com/MDAnalysis/mdanalysis/blob/972a8b1893b5e353b9d796b15d6fcdd022a93f00/package/MDAnalysis/analysis/psa.py#L1787-L1791

Or maybe it was decided the plot shouldn't be labelled, because the labels are then turned off immediately after. Personally I think they're necessary as the clustering rearranges the trajectories.

https://github.com/MDAnalysis/mdanalysis/blob/972a8b1893b5e353b9d796b15d6fcdd022a93f00/package/MDAnalysis/analysis/psa.py#L1798-L1809

It would also be convenient if the actual figure or axes got returned, so users could modify plots further.

2.Files get saved everywhere

Some data gets dumped into pickle files immediately. The class also defines subdirectories to save things into. For example, plots get saved into './plots/my_actual_chosen_filename.pdf'.

https://github.com/MDAnalysis/mdanalysis/blob/972a8b1893b5e353b9d796b15d6fcdd022a93f00/package/MDAnalysis/analysis/psa.py#L1361-L1364

generate_paths has save=True, store=True on by default that saves even more files.

3. Clustering is done for every plot

While it's pretty necessary for the dendrogram, users may also want to plot an annotated heatmap without clustering.

Code to reproduce the behavior

Notebook

Plotting part, specifically

Currently version of MDAnalysis

orbeckst commented 4 years ago

@sseyler do you have any comments?

RMeli commented 4 years ago

Just a side comment: what about using sns.clustermap instead of sns.heatmap in plot_annotated_heatmap?

orbeckst commented 4 years ago

This seems to require a major clean up — I think this can wait until 1.0 is done.