larray-project / larray

N-dimensional labelled arrays in Python
https://larray.readthedocs.io/
GNU General Public License v3.0

grouped and stacked bar plots #1082

Open gdementen opened 1 year ago

gdementen commented 1 year ago

Pandas (and thus larray) currently offers either grouped or stacked bar plots, but not both at the same time. It would be nice to have an option to combine them. The Pandas limitation can be worked around as in: https://stackoverflow.com/questions/59922701/how-can-i-group-a-stacked-bar-chart

We could also offer the functionality out of the box in larray.

Here is some preliminary code, following the same logic as what I did for Bernhard:

import matplotlib.pyplot as plt
from larray import AxisCollection, ndtest

def grouped_stacked_bar(arr, stack=-1, cmap='tab20', **kwargs):
    colors = plt.get_cmap(cmap).colors
    stack_axis = arr.axes[stack]

    # plot the total bars first (sum over all labels of the stack axis)
    chunk = arr.sum(stack_axis)
    colors_idx = 0
    ax = chunk.plot.bar(color=colors[colors_idx:colors_idx+len(chunk)], **kwargs)
    colors_idx += len(chunk)
    # then overlay partial sums, dropping one more stack label each time,
    # with a new set of colors for each pass
    for i in range(1, len(stack_axis)):
        chunk = arr.sum(stack_axis.i[i:])
        chunk.plot.bar(ax=ax, color=colors[colors_idx:colors_idx+len(chunk)], **kwargs)
        colors_idx += len(chunk)
    # one legend entry per (stack label, series label) combination
    axes = AxisCollection(stack_axis).union(chunk.axes[0])
    legend_labels = [' '.join(str(label) for label in comb) for comb in axes.iter_labels()]
    ax.legend(legend_labels)
    return ax

arr = ndtest((3, 4, 5))
grouped_stacked_bar(arr, stack='a')
grouped_stacked_bar(arr, stack='b')

When he saw the result, Bernhard told me he would rather have the color depend only on the stack axis. That would make the color handling code much simpler, but would require better multi-level tick label handling.
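To make the overlay logic of the function above concrete: each pass draws, for every bar, the sum of the stack labels from position i onwards, so the later (shorter) bars cover the lower part of the earlier ones and only the stacked segments remain visible. Here is a minimal sketch in plain Python. The helper name `suffix_sums` is made up for the illustration and is not part of larray, and the trick assumes non-negative values.

```python
def suffix_sums(segments):
    """Return the bar height drawn at each overlay pass for one bar.

    segments: heights of the stacked segments, first stack label first.
    Pass i draws sum(segments[i:]), which hides everything drawn by the
    previous passes except their top part.
    """
    return [sum(segments[i:]) for i in range(len(segments))]


heights = suffix_sums([1, 2, 3])
print(heights)  # [6, 5, 3]

# The part of pass i that stays visible is its difference with the next
# pass, which recovers the original segments.
visible = [h - nxt for h, nxt in zip(heights, heights[1:] + [0])]
print(visible)  # [1, 2, 3]
```

Since the first pass is the total, the first stack label ends up at the top of each bar.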

gdementen commented 10 months ago

Here is a version where the colors depend only on the stack axis, with the corresponding legend fiddling. Fixing the tick labels is probably going to be harder. Maybe we should bypass pandas entirely. See https://matplotlib.org/stable/gallery/lines_bars_and_markers/barchart.html

import matplotlib.pyplot as plt
from larray import ndtest

def grouped_stacked_bar(arr, stack=-1, cmap='tab20', **kwargs):
    colors = plt.get_cmap(cmap).colors
    stack_axis = arr.axes[stack]
    grouped_axes = (arr.axes - stack_axis)[:-1]
    num_groups = grouped_axes.size
    chunk = arr.sum(stack_axis)
    # plot total bars
    ax = chunk.plot.bar(color=colors[0], **kwargs)
    # overlay partial sums, dropping one more stack label on each pass
    # this places the first label on top
    for i in range(1, len(stack_axis)):
        chunk = arr.sum(stack_axis.i[i:])
        chunk.plot.bar(ax=ax, color=colors[i], **kwargs)
    # ax.legend() is equivalent to h, l = ax.get_legend_handles_labels(); ax.legend(h, l)
    legend_handles, legend_labels = ax.get_legend_handles_labels()
    # each pass adds num_groups handles of the same color: keep one per pass
    legend_handles = legend_handles[::num_groups]
    ax.legend(legend_handles, stack_axis.labels)
    # ax.minorticks_on()
    return ax

arr = ndtest((3, 4, 5))
# grouped_stacked_bar(arr, stack='a')
grouped_stacked_bar(arr, stack='b')
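
As a rough idea of what bypassing pandas could look like, here is a sketch that computes the bar geometry by hand and draws it with `Axes.bar` directly. It works on nested lists rather than larray arrays, and the names `grouped_stacked_layout` and `plot_grouped_stacked` are hypothetical, not existing larray or matplotlib functions. It assumes non-negative values and only puts group labels on the x axis; labelling the individual bars within a group is left out.

```python
from itertools import accumulate


def grouped_stacked_layout(values, bar_width=0.8):
    """Compute rectangles for a grouped and stacked bar chart.

    values[g][b][s] is the height of stack segment s of bar b in group g.
    Returns a list of (x, width, bottom, height, stack_index) tuples.
    """
    rects = []
    for g, group in enumerate(values):
        width = bar_width / len(group)
        for b, segments in enumerate(group):
            # bars of one group sit side by side, centred on x = g
            x = g - bar_width / 2 + (b + 0.5) * width
            # each segment starts where the previous ones end
            bottoms = [0, *accumulate(segments)][:-1]
            for s, (bottom, height) in enumerate(zip(bottoms, segments)):
                rects.append((x, width, bottom, height, s))
    return rects


def plot_grouped_stacked(values, group_labels, stack_labels, cmap='tab20'):
    # imported here so that the layout helper can be used without matplotlib
    import matplotlib.pyplot as plt

    colors = plt.get_cmap(cmap).colors
    fig, ax = plt.subplots()
    seen = set()
    for x, width, bottom, height, s in grouped_stacked_layout(values):
        # label only the first rectangle of each stack label, so that the
        # legend has one entry per stack label ('_'-prefixed labels are skipped)
        label = str(stack_labels[s]) if s not in seen else '_nolegend_'
        seen.add(s)
        ax.bar(x, height, width=width, bottom=bottom,
               color=colors[s % len(colors)], label=label)
    ax.set_xticks(range(len(values)))
    ax.set_xticklabels(group_labels)
    ax.legend()
    return ax


# two groups, two bars per group, two stack segments per bar
values = [[[1, 2], [3, 4]],
          [[5, 6], [7, 8]]]
rects = grouped_stacked_layout(values)
# plot_grouped_stacked(values, ['g0', 'g1'], ['s0', 's1'])
```

Because the color depends only on the stack index here, the legend handling reduces to labelling one rectangle per stack label, which is what the second version above obtains by slicing the legend handles.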