Support grabbing the label (and group?) for each `subcoordinate_y` plot from the NdOverlay key

droumis commented 10 months ago

As @philippjfr described here, HoloViews should support grabbing the label for each subcoordinate_y plot from the NdOverlay key.

So something like this should just work:

import numpy as np
import holoviews as hv; hv.extension('bokeh')
from scipy.stats import gaussian_kde

categories = ['A', 'B', 'C', 'D', 'E']
data = {cat: np.random.normal(loc=i-2, scale=1.0, size=100) for i, cat in enumerate(categories)}
x = np.linspace(-5, 5, 100)

areas = {}
for cat, values in data.items():
    pdf = gaussian_kde(values)(x)

    area = hv.Area((x, pdf)).opts(  # no explicit label set per subplot
        subcoordinate_y=True,
        subcoordinate_scale=1.5,
    )
    areas[cat] = area

ridge_plot_areas = hv.NdOverlay(areas).opts(
    width=900,
    height=400,
)

And produce something like this:

droumis commented 8 months ago

Just checking which of my issues are still valid.

This is still a valid issue.

droumis commented 6 months ago

I'm closing this. I am no longer convinced that this is obvious behavior. For instance, what if the key in the dict conflicts with a provided label? I don't think it's that much more work to just set the label per element.
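To make the conflict question concrete, here is a minimal sketch of one possible precedence rule: an explicitly provided label wins, and the NdOverlay key is only a fallback. The function `resolve_label` is hypothetical and not part of HoloViews; it only assumes that an unset element label is the empty string.

```python
def resolve_label(key, explicit_label=""):
    """Hypothetical rule (not HoloViews API): explicit label wins, key is the fallback."""
    if explicit_label:
        # The user set a label on the element, so the key never overrides it.
        return explicit_label
    if isinstance(key, tuple):
        # Multi-dimensional keys are joined into one readable label.
        return ", ".join(str(k) for k in key)
    return str(key)


print(resolve_label("A"))             # key used as the label
print(resolve_label("A", "Control"))  # explicit label kept
print(resolve_label(("EEG 0", "B")))  # tuple key joined
```

Under this rule a conflicting key is simply ignored, so existing code that sets labels per element would keep its current behavior.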

philippjfr commented 6 months ago

I do think this is important. Often you will create this view by grouping a dataset by some dimension, and those operations will produce an NdOverlay keyed by the values along that dimension (e.g. channels). Adding labels manually would be pretty cumbersome.

droumis commented 6 months ago

I'm trying to think through the implementation of this a bit more.

My interpretation of what you mean by 'grouping a dataset by some dimension' is something like calling .overlay, correct?

Here is one possible simplified implementation, which of course doesn't yet work with subcoordinate_y because we cannot yet implicitly set the label per curve.

import numpy as np
import holoviews as hv
import xarray as xr
hv.extension('bokeh')

n_channels = 10
n_seconds = 5
total_samples = 256*n_seconds

time = np.linspace(0, n_seconds, total_samples)
data = np.random.randn(n_channels, total_samples).cumsum(axis=1)
channels = [f"EEG {i}" for i in range(n_channels)]

data_xr = xr.DataArray(data, dims=['channel', 'time'], coords={'channel': channels, 'time': time}, name='value')
curves = hv.Dataset(data_xr).to(hv.Curve, 'time', 'value', 'channel').overlay('channel').opts(
    hv.opts.Curve(
        tools=['hover'],
        # subcoordinate_y=True # Currently requires a unique label per item
    ),
    hv.opts.NdOverlay(
        responsive=True,
        aspect=3,
    )
)
curves

I guess in this case, the fix would be for NdOverlay to grab the channel value as the label.
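Until something like that exists, one possible workaround is to copy each key onto its element with the existing `relabel` method before turning on subcoordinate_y. The helper name `relabel_by_key` below is made up for illustration; it only assumes a mapping with `.items()` whose values expose `.relabel(label=...)`, which HoloViews elements do. The commented usage lines are an untested sketch.

```python
def relabel_by_key(elements):
    """Return a dict whose elements carry their mapping key as the label.

    `elements` is any mapping with .items() (a plain dict or an NdOverlay)
    whose values have a .relabel(label=...) method returning a relabeled copy.
    """
    return {key: element.relabel(label=str(key)) for key, element in elements.items()}


# Possible usage with the `curves` NdOverlay built above (untested sketch):
# labeled = hv.NdOverlay(relabel_by_key(curves), kdims=curves.kdims)
# labeled.opts(hv.opts.Curve(subcoordinate_y=True))
```

This still loops over the elements, just in one place, so it is a stopgap and not the implicit behavior requested here.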

But then if we are trying to avoid for loops entirely, what would be a good approach to setting the group per Curve?

I tried the following but HoloViews doesn't support such complexity:

import numpy as np
import xarray as xr
import holoviews as hv
hv.extension('bokeh')

n_channels = 10
n_seconds = 5
total_samples = 256 * n_seconds
groups = ['A', 'B', 'C']

time = np.linspace(0, n_seconds, total_samples)
data = np.random.randn(n_channels, total_samples).cumsum(axis=1)
channels = [f"EEG {i}" for i in range(n_channels)]

channel_groups = [groups[i % len(groups)] for i in range(n_channels)]

data_xr = xr.DataArray(
    data,
    dims=['channel', 'time'], 
    coords={
        'channel': channels, 
        'time': time,
        'group': ('channel', channel_groups)
    },
    name='value'
)

curves = hv.Dataset(data_xr).to(hv.Curve, 'time', 'value', ['channel', 'group']).overlay('channel').opts(
    hv.opts.Curve(
        tools=['hover'],
    ),
    hv.opts.NdOverlay(
        responsive=True,
        aspect=3,
    )
)

curves

It would be really useful to see (at least pseudocode for) what you think would be a good API.

holoviz / holoviews

Support grabbing the label (and group?) for each `subcoordinate_y` plot from the NdOverlay key #6028