[WIP] Adding ESS Plot - Githubissues

imperorrp commented 2 months ago

Adding ESS plot (#5 )

Currently implemented for kind='local' only, using the new scatter_xy visual element function also added as part of this commit. The ess data ('y' values) obtained from the ess statistical computation via Arviz-Stats is combined with xdata ('x' values) generated via np.linspace (like the legacy ess plot in old Arviz does) after this xdata is broadcasted to fit the shape of the ess data. These are concatenated along a new plot_axis dimension (coords 'x' and 'y') which the scatter_xy visual element function then splits and plots accordingly.

Outputs:

azp.plot_ess(data)

azp.plot_ess(data, var_names=["mu", "tau"])

📚 Documentation preview 📚: https://arviz-plots--58.org.readthedocs.build/en/58/

codecov-commenter commented 2 months ago

Codecov Report

Attention: Patch coverage is 94.05941% with 12 lines in your changes missing coverage. Please review.

Project coverage is 85.53%. Comparing base (f4a39af) to head (1ca070c).

Files with missing lines	Patch %	Lines
src/arviz_plots/plots/essplot.py	94.17%	11 Missing :warning:
src/arviz_plots/visuals/__init__.py	91.66%	1 Missing :warning:

Additional details and impacted files

```diff @@ Coverage Diff @@ ## main #58 +/- ## ========================================== + Coverage 84.80% 85.53% +0.73% ========================================== Files 21 22 +1 Lines 2336 2538 +202 ========================================== + Hits 1981 2171 +190 - Misses 355 367 +12 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

imperorrp commented 1 month ago

Fixed the docstring switch-up for type 'local' and 'quantile' and incorporated other suggestions including the x aesthetic addition for the 'model' dimension. I made a temporary addition to scatter_xy to add the extra x arg passed due to this new aesthetic. The output ends up looking like this though-

pc = azp.plot_ess(
        {"centered": data, "non centered": data2}, var_names=["mu", "tau"]
    )
pc.add_legend("model")

The default x arg value generated for the second model is '1', which makes the data get severely skewed out of the original x axis range. Should we hardcode some smaller 'x' values with something like pc_kwargs['x'].setdefault(np.linspace(0, 0.01, 9)) maybe?

imperorrp commented 1 month ago

Updated x aesthetic mapping for multiple-model cases:

The logic followed is as below. The x_diff is calculated, and currently one-third of that is taken as the range within which points of different models can be plotted. np.linspace then ensures an even distribution of the points of whatever number of models to plot within the aforementioned range.

  # setting x aesthetic to np.linspace(-x_diff/3, x_diff/3, length of 'model' dim)
  # x_diff = span of x axis (1) divided by number of points to be plotted (n_points)
  x_diff = 1 / n_points
  if "x" not in pc_kwargs:
      pc_kwargs["x"] = np.linspace(-x_diff / 3, x_diff / 3, distribution.sizes["model"])
  pc_kwargs["aes"].setdefault("x", ["model"])

And quantile plots added-

Labelling, along the y axes ('ESS for small intervals' for kind='sample', 'ESS for quantiles' for type='quantile') and x axis ('quantile') has also been added

imperorrp commented 1 month ago

Added rugplot to plot_ess.

Like plot_trace, I've set the 'overlay' aesthetic for the 'chain' dimension but is ignored with .difference() when the default aes_map is generated for the local and quantile artists.

Also made a modification to the trace_rug visual element to add a new arg scale, which helps in controlling the xvalues range. It is set to the length of the draw dimension so that the xvalues range is kept between 0-1.

Plot output:

azp.plot_ess(data, var_names=["mu", "tau"], rug=True)

imperorrp commented 1 month ago

To do: Apply rankdata function to the data once available in Arviz-Stats for proper rug generation.

imperorrp commented 1 month ago

Just made updates. This takes into account #66 and plots the mean, sd and min_ess as well using line_xy. Users can also set custom text for xlabel and ylabel now:

Annotating these lines is left as well as fixing an issue when the model dimension exists, where the output is like this:

imperorrp commented 1 month ago

Plot_ess output is now like this:

Each of 'mean', 'sd', and 'min_ess' has their own linestyle to distinguish between them. Maybe having another legend for them could be of help though as it might not be evident to a user which is which unless the the linestyle cycle and order of plotting is known by looking at the source code.

The 'min_ess' line is set to gray by default and 'mean' and 'sd' to the first color in the color cycle by default. If the 'model' dim is present, then all three elements get the same color for a model.

OriolAbril commented 4 weeks ago

I think annotating the lines is he only way to clearly indicate which is which. If you set extra_methods=True in current arviz, you'll see the output looks like this:

imatge

I would add these two annotations as visual elements here too

imperorrp commented 4 weeks ago

I think annotating the lines is he only way to clearly indicate which is which. If you set extra_methods=True in current arviz, you'll see the output looks like this:

I would add these two annotations as visual elements here too

Todo: Add a new visual element for these annotations

imperorrp commented 3 weeks ago

Rebased essplot commits

imperorrp commented 3 weeks ago

Added new annotate_xy visual element with in-built logic to determine whether to vertically align the intended annotating text top or bottom based on an extra_da arg passed to it. This helps when two lines like mean and sd both are being plotted. Defaults for them both are also set- 'bottom' for mean and 'top' for sd.

New plot_kwargs keys mean_text and sd_text are also added and their default aesthetics from the aes_maps for mean and sd are also used for these if not set.

Outputs:

imperorrp commented 1 week ago

Some examples to the docstring and to the example gallery for plot_ess were added. Documentation seems to build fine when tried locally except with a warning, and this seems to be the case in the doc building log here too. There are lots of NaN is not an array.
Tests were added in test_plots.py and test_hypothesis_plots.py. In the former, testing with the datatree_sample fixture causes the ess computing function from Arviz-Stats to raise an error due to number of sample_dims not being equal to 2, which it seems to require. In the latter, a couple of errors are being raised that have to be looked into. One is with the time limit being exceeded.

imperorrp commented 1 week ago

Fixed hypothesis test errors for ESS plot

imperorrp commented 1 day ago

Rebased the PR

imperorrp commented 21 hours ago

Made those modifications. Also, pylint was indicating an error regarding the number of tests under the TestPlots class in testplots.py so I raised the limit from 20 to 25.

The output is like this right now though:

^All 10 subplots in one row despite the print statements indicating the figsizing logic computes 2 rows and sticks to the 5 max subplots per row constraint. This has to be looked into if I'm missing something.

arviz-devs / arviz-plots

[WIP] Adding ESS Plot #58

Outputs:

Codecov Report

Plot output:

Outputs: