undark-lab / swyft

A system for scientific simulation-based inference at scale.

Consolidate plotting routines #137

Open cweniger opened 11 months ago

cweniger commented 11 months ago

Right now we have both plot2.py and plot.py, and labels are used inconsistently.

cweniger commented 11 months ago

Suggested Plan

Functions

General functionality

Example notebook

NoemiAM commented 11 months ago

@cweniger the plan looks good. I agree that some more user-friendly functionality to plot weighted samples and not only LogRatioSamples is highly needed.

I'm not sure if this is already your plan, but I would suggest a plot_pair function to handle all cases with 2d posteriors. So the corner plot without the diagonal shouldn't be an extension of plot_corner, but an option of plot_pair. Then, if one is interested, e.g., in the correlations of only one specific parameter with many others, plot_pair should plot just a grid and not the full lower triangle.

cweniger commented 11 months ago

@NoemiAM Good suggestion regarding plot_pair. How would you signal that distinction to the routine? One option would be to have [par1, par2, par3, ...] generate a corner-type plot, and [[par1, par2], [par1, par3], [par2, par5], ...] a grid-type plot.

Regarding handling weighted samples, one option would be to simply add the value logweight to the swyft.LogRatioSamples object. In most cases it would be set to None, indicating equal weights for all samples, but in the case of MCMC or nested sampling it could be changed accordingly. One annoying thing is that logratios might in some cases actually mean loglike (for instance if we ever include NPE).
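As a rough illustration of the logweight convention described above (None meaning equal weights), here is a minimal sketch. The helper name `normalized_weights` and its signature are hypothetical and not part of swyft; it only shows how an optional log-weight array could be turned into normalized weights for plotting.

```python
import numpy as np


def normalized_weights(n_samples, logweight=None):
    """Hypothetical helper (not swyft API): weights that sum to 1.

    logweight=None is taken to mean equal weights for all samples,
    mirroring the convention proposed in this thread.
    """
    if logweight is None:
        return np.full(n_samples, 1.0 / n_samples)
    logweight = np.asarray(logweight, dtype=float)
    if logweight.shape != (n_samples,):
        raise ValueError("logweight must have one entry per sample")
    # Subtract the maximum before exponentiating for numerical stability.
    weights = np.exp(logweight - logweight.max())
    return weights / weights.sum()
```

The resulting array could then be passed to something like `np.histogram(samples, bins=20, weights=w)`, so that MCMC or nested-sampling output and equally weighted samples go through the same plotting path.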

NoemiAM commented 11 months ago

@cweniger Re plot_pair, yes it should be handled like we handle "marginals" in ratio estimators. So a long list [par1, par2, par3, ...] should produce the lower triangle corner, and comma-separated lists should produce the corresponding grids. In practice, it might be tricky to write, but it should be doable.
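A minimal sketch of how the two input styles discussed above might be told apart. The function name `resolve_pair_layout` and its return format are assumptions made for illustration, not an existing swyft routine.

```python
from itertools import combinations


def resolve_pair_layout(params):
    """Hypothetical helper (not swyft API): classify a plot_pair-style argument.

    Returns (layout, pairs), where layout is "corner" or "grid" and
    pairs is the list of parameter pairs to draw.
    """
    params = list(params)
    if not params:
        raise ValueError("params must not be empty")
    # A flat list of names requests the lower-triangle corner layout.
    if all(isinstance(p, str) for p in params):
        if len(params) < 2:
            raise ValueError("need at least two parameters for a pair plot")
        # Every unordered pair, in the order the names were listed.
        return "corner", list(combinations(params, 2))
    # A list of explicit 2-element lists/tuples requests a grid of panels.
    if all(isinstance(p, (list, tuple)) and len(p) == 2 for p in params):
        return "grid", [tuple(p) for p in params]
    raise ValueError("params must be a flat list of names or a list of pairs")
```

For example, `resolve_pair_layout(["par1", "par2", "par3"])` yields the corner layout with all three pairs, while `resolve_pair_layout([["par1", "par2"], ["par1", "par3"]])` yields a grid with only the two requested panels.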

Re LogRatioSamples, I think for now it's enough to add logweight. When we add NPE, it might be worth renaming the LogRatioSamples object to something more general indicating results/predictions, which would contain logratios, loglike, logweight... depending on the inference task.