Arcadia-Science / arcadia-pycolor

tools for using the Arcadia palette in Python
MIT License

Gradient color interpolation #41

Closed ekiefl closed 3 weeks ago

ekiefl commented 1 month ago

I'm using nglview and would like to register custom color gradients for coloring protein surfaces. Unfortunately, you can't pass a matplotlib colormap or a custom gradient definition like in plotly ([(c1, v1), ...]). This means I need more granular control, and must interpolate colors across the gradient based on a value range.

I think this could be part of the Gradient class. Something like:

def interpolate_color(
    self, values: Iterable[float], min_value: float, max_value: float
) -> list[str]:
    # Materialize once so a generator is not exhausted by the range check.
    values = list(values)
    if not all(min_value <= value <= max_value for value in values):
        raise ValueError("All values must be within the range [min_value, max_value].")

    cmap = self.to_mpl_cmap()
    # mcolors refers to matplotlib.colors; to_hex returns strings such as "#808080".
    return [mcolors.to_hex(cmap((value - min_value) / (max_value - min_value))) for value in values]

Semantically, I guess it should really return a list of HexCode objects, but practically I think all use cases would want the hex strings directly.

I guess it would also be better to snap any values outside the range to the closest extremum.

mezarque commented 1 month ago

Thanks for this suggestion! I'm curious whether this needs to be within the package, or whether the mapping should instead be managed outside of this package, in the code that you're using to interface with nglview.

If you're looking for a list of HexCode objects that form a set number of bins within your defined gradient, you can resample the gradient using Gradient.resample_as_palette(steps). This returns a Palette of HexCodes with a number of colors equal to steps, which you can then map onto your ranges using zip or an equivalent function.
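The binning approach described above can be sketched as follows. This is only an illustration and deliberately avoids arcadia-pycolor so that it runs on its own: the hard-coded hex strings stand in for the colors that Gradient.resample_as_palette(steps) would return, and bin_values_to_palette is a hypothetical helper, not part of the package.

```python
def bin_values_to_palette(
    values: list[float],
    palette: list[str],
    min_value: float,
    max_value: float,
) -> list[str]:
    """Assign each value the color of the equal-width bin it falls into."""
    if max_value <= min_value:
        raise ValueError("max_value must be greater than min_value.")

    steps = len(palette)
    bin_width = (max_value - min_value) / steps

    colors = []
    for value in values:
        index = int((value - min_value) / bin_width)
        # Values at or beyond the edges fall into the first or last bin.
        index = max(0, min(steps - 1, index))
        colors.append(palette[index])
    return colors


# Placeholder palette (assumption): four grays standing in for a resampled gradient.
palette = ["#000000", "#555555", "#aaaaaa", "#ffffff"]
binned = bin_values_to_palette([0.0, 0.3, 0.6, 1.0], palette, 0.0, 1.0)
```

Every value inside the same bin receives the same color, so the result is a step function of the input rather than a smooth ramp.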

ekiefl commented 1 month ago

> I'm curious whether this needs to be within the package, or whether the mapping should instead be managed outside of this package, in the code that you're using to interface with nglview

As the person suggesting this feature, I'll be the first to say it doesn't need to exist in this package. But I believe sampling a color from a gradient based on a value range is the meat and potatoes of what a gradient is used for. That's what happens when we pass an argument for cmap= in matplotlib, and the operation is so ubiquitous that it's even implemented as the very __call__ method of matplotlib's Colormap.

While I can implement this as a utility in my own code, I don't see my use case as bespoke. I see it as belonging to a broad class of use cases in which users can't hand over the responsibility of mapping values to colors to packages like matplotlib and seaborn, which directly ingest the Colormap generated by to_mpl_cmap. Including it in my code feels like fulfilling a responsibility that ought to be fulfilled by our color package.

> If you're looking for a list of HexCode objects that form a set number of bins within your defined gradient

That's not quite what I'm trying to do. I'm not looking to sample the color gradient into a discrete palette, I'm looking to map values to the color continuum defined by the gradient.
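To make the distinction concrete, here is a rough sketch of continuous mapping between two colors, using only the standard library. The helper names (hex_to_rgb, interpolate_hex, map_value) are made up for illustration and the two-color gradient is an assumption. A matplotlib colormap works from a lookup table, so its output may differ slightly from this direct interpolation.

```python
def hex_to_rgb(hex_code: str) -> tuple[int, int, int]:
    """Convert '#rrggbb' into a tuple of 0-255 integers."""
    digits = hex_code.lstrip("#")
    return (int(digits[0:2], 16), int(digits[2:4], 16), int(digits[4:6], 16))


def interpolate_hex(start: str, end: str, fraction: float) -> str:
    """Linearly interpolate between two hex colors at a fraction in [0, 1]."""
    # Snap fractions outside the gradient to the nearest end.
    fraction = max(0.0, min(1.0, fraction))
    channels = [
        round(a + (b - a) * fraction)
        for a, b in zip(hex_to_rgb(start), hex_to_rgb(end))
    ]
    return "#{:02x}{:02x}{:02x}".format(*channels)


def map_value(
    value: float, min_value: float, max_value: float, start: str, end: str
) -> str:
    """Map a value in [min_value, max_value] onto the start-to-end color ramp."""
    fraction = (value - min_value) / (max_value - min_value)
    return interpolate_hex(start, end, fraction)


# Each distinct value gets its own color; nothing is grouped into bins.
colors = [map_value(v, 10, 20, "#000000", "#ffffff") for v in (10, 12, 20, 25)]
```

Unlike a resampled palette, two nearby values such as 12 and 12.5 map to different colors here, which is the behavior needed for coloring a surface by a continuous quantity.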

ekiefl commented 1 month ago

For the time being, I've implemented the following color utility in my package. If we want to include parts of it in arcadia-pycolor, here it is.

color.py:

from collections.abc import Sequence

import arcadia_pycolor as apc
import matplotlib.colors

def map_values_to_gradient(
    gradient: apc.Gradient,
    values: Sequence[float],
    min_value: float | None = None,
    max_value: float | None = None,
) -> list[str]:
    """Map a sequence of values to their corresponding colors from a gradient.

    Args:
        gradient:
            The gradient that defines the color spectrum.
        values:
            The values to map to colors.
        min_value:
            Determines which value corresponds to the first color in the spectrum.
            Values less than this are given this color. If not provided, min(values) is
            chosen.
        max_value:
            Determines which value corresponds to the last color in the spectrum. Values
            greater than this are given this color. If not provided, max(values) is
            chosen.

    Returns:
        A list of hex code strings.

    Note:
        - This should perhaps be the responsibility of arcadia_pycolor
          (https://github.com/Arcadia-Science/arcadia-pycolor/issues/41)
    """

    if not len(values):
        return []

    if min_value is None:
        min_value = min(values)

    if max_value is None:
        max_value = max(values)

    if min_value > max_value:
        raise ValueError(f"max_value ({max_value}) must be greater than or equal to min_value ({min_value}).")

    cmap = gradient.to_mpl_cmap()

    if min_value == max_value:
        # Value range is 0. Return the midrange color for each value.
        return [matplotlib.colors.to_hex(cmap(0.5))] * len(values)

    normalized_values = [(value - min_value) / (max_value - min_value) for value in values]
    clamped_values = [max(0.0, min(1.0, value)) for value in normalized_values]

    return [matplotlib.colors.to_hex(cmap(value)) for value in clamped_values]

test_color.py:

import arcadia_pycolor as apc
import pytest

from sip.visualization.color import map_values_to_gradient

@pytest.fixture
def gradient() -> apc.Gradient:
    return apc.Gradient("_", [apc.black, apc.white], [0.0, 1.0])

@pytest.mark.parametrize(
    "values, expected_colors",
    [
        ([0, 1], ["#000000", "#ffffff"]),
        ([0, 0.5, 1], ["#000000", "#808080", "#ffffff"]),
        ([1, 2, 3, 4, 5], ["#000000", "#404040", "#808080", "#c0c0c0", "#ffffff"]),
        ([-1, 0, 1], ["#000000", "#808080", "#ffffff"]),
        ([], []),
    ],
)
def test_map_values_to_gradient_basic_cases(
    gradient: apc.Gradient,
    values: list[float],
    expected_colors: list[str],
):
    assert map_values_to_gradient(gradient, values) == expected_colors

@pytest.mark.parametrize(
    "values, min_value, max_value, expected_colors",
    [
        ([0, 0.5, 1], 0, 1, ["#000000", "#808080", "#ffffff"]),
        ([0, 0.5, 1], 0.25, 0.75, ["#000000", "#808080", "#ffffff"]),
        ([-1, 0.5, 2], 0, 1, ["#000000", "#808080", "#ffffff"]),
        ([0, 10], 0, 20, ["#000000", "#808080"]),
        ([0, 10], 0, 0, ["#808080", "#808080"]),
    ],
)
def test_map_values_to_gradient_custom_ranges(
    gradient: apc.Gradient,
    values: list[float],
    min_value: float,
    max_value: float,
    expected_colors: list[str],
):
    assert map_values_to_gradient(gradient, values, min_value, max_value) == expected_colors

def test_map_values_to_gradient_invalid_cases(gradient: apc.Gradient):
    # You can't pass min larger than max
    with pytest.raises(ValueError, match="must be greater than"):
        map_values_to_gradient(gradient, [0, 1], min_value=1, max_value=0)

mezarque commented 1 month ago

Thanks for sharing this! I think I better understand what you meant before; we can definitely include this functionality in future versions. If you wanted to open a PR including this code, I'm happy to review it.

mezarque commented 3 weeks ago

Incorporated by #43