Consolidate plotting routines

cweniger commented 11 months ago

Right now we have plot2.py and plot.py, and labels are used inconsistently.

Make labels available in all plotting routines.
Build the subplot grid inside plot_1d (or in a new function).

cweniger commented 11 months ago

Suggested Plan

Functions

plot_1d: Plots single 1-dim posteriors
plot_2d: Plots single 2-dim posteriors
plot_corner: Plots corner plots with multiple 1-dim and 2-dim posteriors; optionally the diagonal can be omitted (making this similar to plot_pair in ArviZ)
plot_posterior: Plots a grid of 1-dim posteriors
plot_pair: Plots a grid of 2-dim posteriors

General functionality

If available, marginal posteriors will be used to interpolate rather than generate histograms.
Plotting functions should be able to handle weighted prior samples, as they are effectively produced by nested sampling techniques (related to residual volume) or MCMC (effectively the inverse of the likelihood). This requires an extension of swyft.LogRatioSamples to include logv or something similar.
Reasonable defaults
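
A minimal sketch of what handling weighted samples could look like for a 1-dim histogram. The helper name `weighted_hist_1d` and its signature are hypothetical and not part of swyft; it uses only the standard library and treats `weights=None` as equal weights.

```python
from typing import List, Optional, Sequence, Tuple


def weighted_hist_1d(
    samples: Sequence[float],
    weights: Optional[Sequence[float]] = None,
    bins: int = 10,
) -> Tuple[List[float], List[float]]:
    """Return (bin_edges, density) for a weighted 1-dim histogram.

    Hypothetical helper for illustration only, not a swyft function.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    if weights is None:
        # No weights given: every sample counts equally.
        weights = [1.0] * n
    if len(weights) != n:
        raise ValueError("samples and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")

    lo, hi = min(samples), max(samples)
    if hi == lo:
        # Degenerate case: widen the range so the bin width is non-zero.
        hi = lo + 1.0
    width = (hi - lo) / bins
    edges = [lo + i * width for i in range(bins + 1)]

    mass = [0.0] * bins
    for x, w in zip(samples, weights):
        # The right-most edge belongs to the last bin.
        idx = min(int((x - lo) / width), bins - 1)
        mass[idx] += w / total

    # Divide by the bin width so the histogram integrates to one.
    density = [m / width for m in mass]
    return edges, density
```

Because the masses are normalized before dividing by the bin width, the result integrates to one regardless of the overall scale of the weights.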

Example notebook

Introduce example notebook with hand-crafted swyft.LogRatioSamples, which are then used to test and demonstrate plotting functionality. This eases tests, and decouples inference and plotting related issues.

NoemiAM commented 11 months ago

@cweniger the plan looks good. I agree that more user-friendly functionality for plotting weighted samples, and not only LogRatioSamples, is much needed.

I'm not sure if this is already your plan, but I would suggest that the plot_pair function handle all cases with 2d posteriors. So the corner without diagonal shouldn't be an extension of plot_corner, but an option of plot_pair. Then, if one is interested, e.g., in the correlations of only one specific parameter with many others, plot_pair should plot just a grid and not the full lower triangle.

cweniger commented 11 months ago

@NoemiAM Good suggestion regarding plot_pair. How would you signal that distinction to the routine? One option would be to have [par1, par2, par3, ...] generate a corner-type plot, and [[par1, par2], [par1, par3], [par2, par5], ...] a grid-type plot.

Regarding handling weighted samples, one option would be to simply add the value logweight to the swyft.LogRatioSamples object. In most cases it would be set to None, indicating equal weights for all samples, but in the case of MCMC or nested sampling it could be changed accordingly. One annoying thing is that logratios might in some cases actually mean loglike (for instance if we ever include NPE).
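
As a rough illustration of the logweight idea, the sketch below uses a hypothetical stand-in class, `WeightedSamples`, rather than the real swyft.LogRatioSamples (whose fields are not reproduced here). It only shows the convention that `logweight=None` means equal weights, and how log-weights could be turned into normalized weights in a numerically stable way.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WeightedSamples:
    """Hypothetical stand-in for swyft.LogRatioSamples with a logweight field."""

    params: List[List[float]]
    logratios: List[float]
    logweight: Optional[List[float]] = None  # None means equal weights

    def normalized_weights(self) -> List[float]:
        """Return sample weights that sum to one."""
        n = len(self.logratios)
        if n == 0:
            raise ValueError("no samples")
        if self.logweight is None:
            return [1.0 / n] * n
        if len(self.logweight) != n:
            raise ValueError("logweight must have one entry per sample")
        # Subtract the maximum before exponentiating to avoid overflow.
        m = max(self.logweight)
        w = [math.exp(lw - m) for lw in self.logweight]
        total = sum(w)
        return [wi / total for wi in w]
```

Plotting code would then only ever call `normalized_weights()` and would not need to know whether the samples came from the prior, MCMC, or nested sampling.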

NoemiAM commented 11 months ago

@cweniger Re plot_pair, yes, it should be handled like we handle "marginals" in ratio estimators. So a long list [par1, par2, par3, ...] should produce the lower-triangle corner, and comma-separated lists should produce the corresponding grids. In practice, it might be tricky to write, but it should be doable.
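
One possible way to distinguish the two input forms is sketched below. The function name `resolve_pairs` and its return convention are assumptions made for illustration, not an existing swyft API: a flat list of names is expanded into all pairs for a lower-triangle corner, while a list of two-element lists is taken as an explicit grid.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def resolve_pairs(parnames: Sequence) -> Tuple[str, List[Tuple[str, str]]]:
    """Map a parameter specification to (layout, list of parameter pairs).

    Hypothetical helper, not part of swyft.
    """
    if not parnames:
        raise ValueError("parnames must not be empty")

    # Flat list of names: corner-type plot over all unordered pairs.
    if all(isinstance(p, str) for p in parnames):
        if len(parnames) < 2:
            raise ValueError("need at least two parameters for a corner plot")
        return "corner", list(combinations(parnames, 2))

    # List of explicit pairs: grid-type plot, one panel per pair.
    if all(not isinstance(p, str) and len(p) == 2 for p in parnames):
        return "grid", [(p[0], p[1]) for p in parnames]

    raise ValueError("parnames must be a flat list of names or a list of pairs")
```

For example, `resolve_pairs(["par1", "par2", "par3"])` selects the corner layout, whereas `resolve_pairs([["par1", "par2"], ["par1", "par3"]])` selects the grid layout with exactly those two panels.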

Re LogRatioSamples, I think for now it's enough to add logweight. When we add NPE, it might be worth renaming the LogRatioSamples object to something more general indicating results/predictions, which would contain logratios, loglike, logweight... depending on the inference task.

undark-lab / swyft