plotly / plotly.py

The interactive graphing library for Python :sparkles: This project now includes Plotly Express!
https://plotly.com/python/
MIT License

color_continuous_scale incompatible with marginal distributions in density_heatmap #4377

Open MoustHolmes opened 10 months ago

MoustHolmes commented 10 months ago

I was trying out the examples from https://plotly.com/python/2D-Histogram/ and wanted to use the density heatmap with marginal distributions on x and y, but with a different colour map to align with some previous work. When I combined the two consecutive examples from the 2D-Histogram page, I got an error. If marginal distributions and color_continuous_scale aren't meant to be used together, there should at least be a better error message. I also tested other kinds of marginal distributions (box, violin and rug) with the same result.

plotly Version: 5.17.0

My code:

import plotly.express as px
df = px.data.tips()

fig = px.density_heatmap(df, x="total_bill", y="tip", nbinsx=20, nbinsy=20, marginal_x="histogram", marginal_y="histogram", color_continuous_scale="Viridis")
fig.show()

Traceback:

ValueError                                Traceback (most recent call last)
/groups/icecube/moust/work/IceCubeEncoderTransformer/notebooks/attention_weight_event_viewer.ipynb Cell 10 line 4
      1 import plotly.express as px
      2 df = px.data.tips()
----> 4 fig = px.density_heatmap(df, x="total_bill", y="tip", nbinsx=20, nbinsy=20, marginal_y="histogram", color_continuous_scale="Viridis")
      5 fig.show()

File ~/miniconda3/envs/icet2/lib/python3.8/site-packages/plotly/express/_chart_types.py:187, in density_heatmap(data_frame, x, y, z, facet_row, facet_col, facet_col_wrap, facet_row_spacing, facet_col_spacing, hover_name, hover_data, animation_frame, animation_group, category_orders, labels, orientation, color_continuous_scale, range_color, color_continuous_midpoint, marginal_x, marginal_y, opacity, log_x, log_y, range_x, range_y, histfunc, histnorm, nbinsx, nbinsy, text_auto, title, template, width, height)
    145 def density_heatmap(
    146     data_frame=None,
    147     x=None,
   (...)
    180     height=None,
    181 ) -> go.Figure:
    182     """
    183     In a density heatmap, rows of `data_frame` are grouped together into
    184     colored rectangular tiles to visualize the 2D distribution of an
    185     aggregate function `histfunc` (e.g. the count or sum) of the value `z`.
    186     """
--> 187     return make_figure(
    188         args=locals(),
    189         constructor=go.Histogram2d,
    190         trace_patch=dict(
    191             histfunc=histfunc,
    192             histnorm=histnorm,
    193             nbinsx=nbinsx,
    194             nbinsy=nbinsy,
    195             xbingroup="x",
    196             ybingroup="y",
    197         ),
    198     )

File ~/miniconda3/envs/icet2/lib/python3.8/site-packages/plotly/express/_core.py:2256, in make_figure(args, constructor, trace_patch, layout_patch)
   2251         group[var] = 100.0 * group[var] / group_sum
   2253 patch, fit_results = make_trace_kwargs(
   2254     args, trace_spec, group, mapping_labels.copy(), sizeref
   2255 )
-> 2256 trace.update(patch)
   2257 if fit_results is not None:
   2258     trendline_rows.append(mapping_labels.copy())

File ~/miniconda3/envs/icet2/lib/python3.8/site-packages/plotly/basedatatypes.py:5141, in BasePlotlyType.update(self, dict1, overwrite, **kwargs)
   5139         BaseFigure._perform_update(self, kwargs, overwrite=overwrite)
   5140 else:
-> 5141     BaseFigure._perform_update(self, dict1, overwrite=overwrite)
   5142     BaseFigure._perform_update(self, kwargs, overwrite=overwrite)
   5144 return self

File ~/miniconda3/envs/icet2/lib/python3.8/site-packages/plotly/basedatatypes.py:3921, in BaseFigure._perform_update(plotly_obj, update_obj, overwrite)
   3915 validator = plotly_obj._get_prop_validator(key)
   3917 if isinstance(validator, CompoundValidator) and isinstance(val, dict):
   3918 
   3919     # Update compound objects recursively
   3920     # plotly_obj[key].update(val)
-> 3921     BaseFigure._perform_update(plotly_obj[key], val)
   3922 elif isinstance(validator, CompoundArrayValidator):
   3923     if plotly_obj[key]:
   3924         # plotly_obj has an existing non-empty array for key
   3925         # In this case we merge val into the existing elements

File ~/miniconda3/envs/icet2/lib/python3.8/site-packages/plotly/basedatatypes.py:3942, in BaseFigure._perform_update(plotly_obj, update_obj, overwrite)
   3939                 plotly_obj[key] = val
   3940         else:
   3941             # Assign non-compound value
-> 3942             plotly_obj[key] = val
   3944 elif isinstance(plotly_obj, tuple):
   3946     if len(update_obj) == 0:
   3947         # Nothing to do

File ~/miniconda3/envs/icet2/lib/python3.8/site-packages/plotly/basedatatypes.py:4876, in BasePlotlyType.__setitem__(self, prop, value)
   4872         self._set_array_prop(prop, value)
   4874     # ### Handle simple property ###
   4875     else:
-> 4876         self._set_prop(prop, value)
   4877 else:
   4878     # Make sure properties dict is initialized
   4879     self._init_props()

File ~/miniconda3/envs/icet2/lib/python3.8/site-packages/plotly/basedatatypes.py:5220, in BasePlotlyType._set_prop(self, prop, val)
   5218         return
   5219     else:
-> 5220         raise err
   5222 # val is None
   5223 # -----------
   5224 if val is None:
   5225     # Check if we should send null update

File ~/miniconda3/envs/icet2/lib/python3.8/site-packages/plotly/basedatatypes.py:5215, in BasePlotlyType._set_prop(self, prop, val)
   5212 validator = self._get_validator(prop)
   5214 try:
-> 5215     val = validator.validate_coerce(val)
   5216 except ValueError as err:
   5217     if self._skip_invalid:

File ~/miniconda3/envs/icet2/lib/python3.8/site-packages/_plotly_utils/basevalidators.py:1374, in ColorValidator.validate_coerce(self, v, should_raise)
   1372     validated_v = self.vc_scalar(v)
   1373     if validated_v is None and should_raise:
-> 1374         self.raise_invalid_val(v)
   1376     v = validated_v
   1378 return v

File ~/miniconda3/envs/icet2/lib/python3.8/site-packages/_plotly_utils/basevalidators.py:287, in BaseValidator.raise_invalid_val(self, v, inds)
    284             for i in inds:
    285                 name += "[" + str(i) + "]"
--> 287         raise ValueError(
    288             """
    289     Invalid value of type {typ} received for the '{name}' property of {pname}
    290         Received value: {v}
    291 
    292 {valid_clr_desc}""".format(
    293                 name=name,
    294                 pname=self.parent_name,
    295                 typ=type_str(v),
    296                 v=repr(v),
    297                 valid_clr_desc=self.description(),
    298             )
    299         )

ValueError: 
    Invalid value of type 'builtins.str' received for the 'color' property of histogram.marker
        Received value: 'V'

    The 'color' property is a color and may be specified as:
      - A hex string (e.g. '#ff0000')
      - An rgb/rgba string (e.g. 'rgb(255,0,0)')
      - An hsl/hsla string (e.g. 'hsl(0,100%,50%)')
      - An hsv/hsva string (e.g. 'hsv(0,100%,100%)')
      - A named CSS color:
            aliceblue, antiquewhite, aqua, aquamarine, azure,
            beige, bisque, black, blanchedalmond, blue,
            blueviolet, brown, burlywood, cadetblue,
            chartreuse, chocolate, coral, cornflowerblue,
            cornsilk, crimson, cyan, darkblue, darkcyan,
            darkgoldenrod, darkgray, darkgrey, darkgreen,
            darkkhaki, darkmagenta, darkolivegreen, darkorange,
            darkorchid, darkred, darksalmon, darkseagreen,
            darkslateblue, darkslategray, darkslategrey,
            darkturquoise, darkviolet, deeppink, deepskyblue,
            dimgray, dimgrey, dodgerblue, firebrick,
            floralwhite, forestgreen, fuchsia, gainsboro,
            ghostwhite, gold, goldenrod, gray, grey, green,
            greenyellow, honeydew, hotpink, indianred, indigo,
            ivory, khaki, lavender, lavenderblush, lawngreen,
            lemonchiffon, lightblue, lightcoral, lightcyan,
            lightgoldenrodyellow, lightgray, lightgrey,
            lightgreen, lightpink, lightsalmon, lightseagreen,
            lightskyblue, lightslategray, lightslategrey,
            lightsteelblue, lightyellow, lime, limegreen,
            linen, magenta, maroon, mediumaquamarine,
            mediumblue, mediumorchid, mediumpurple,
            mediumseagreen, mediumslateblue, mediumspringgreen,
            mediumturquoise, mediumvioletred, midnightblue,
            mintcream, mistyrose, moccasin, navajowhite, navy,
            oldlace, olive, olivedrab, orange, orangered,
            orchid, palegoldenrod, palegreen, paleturquoise,
            palevioletred, papayawhip, peachpuff, peru, pink,
            plum, powderblue, purple, red, rosybrown,
            royalblue, rebeccapurple, saddlebrown, salmon,
            sandybrown, seagreen, seashell, sienna, silver,
            skyblue, slateblue, slategray, slategrey, snow,
            springgreen, steelblue, tan, teal, thistle, tomato,
            turquoise, violet, wheat, white, whitesmoke,
            yellow, yellowgreen
      - A number that will be interpreted as a color
        according to histogram.marker.colorscale
      - A list or array of any of the above

iAnanich commented 8 months ago

Can confirm.

Here's a simple example to see the difference:


from plotly import express as px
import pandas as pd

df = pd.DataFrame({"x": [1, 2], "y": [3,4]})

# works
px.density_heatmap(df)

# works
px.density_heatmap(df, color_continuous_scale="Viridis")

# works
px.density_heatmap(df, marginal_y="histogram", marginal_x="histogram")

# raises error
px.density_heatmap(df, marginal_y="histogram", marginal_x="histogram", color_continuous_scale="Viridis")

Coding-with-Adam commented 8 months ago

I can confirm the same error message.

from plotly import express as px
import pandas as pd
dff = px.data.tips()
fig = px.density_heatmap(dff, x="total_bill", y="tip", marginal_x="histogram")
fig.show()

error_fig = px.density_heatmap(dff, x="total_bill", y="tip", marginal_x="histogram", color_continuous_scale="Viridis")
error_fig.show()

It's as if it's trying to apply the color scale to the marginal histogram as well...
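
The 'V' in the error message is consistent with that guess. A minimal sketch, assuming (this is not confirmed anywhere in the thread) that the marginal trace's marker color is taken as element 0 of whatever was passed as color_continuous_scale: indexing the name "Viridis" gives its first character, while indexing a resolved list of colors gives an actual color. The hex values are the Viridis sequence from plotly.colors.sequential, hard-coded so the snippet needs no imports; `first_element` is a made-up helper.

```python
# Viridis colors as listed in plotly.colors.sequential.Viridis (hard-coded here).
VIRIDIS = [
    "#440154", "#482878", "#3e4989", "#31688e", "#26828e",
    "#1f9e89", "#35b779", "#6ece58", "#b5de2b", "#fde725",
]


def first_element(scale):
    """Hypothetical stand-in for 'take the first entry of the scale'."""
    return scale[0]


# A str is indexable too, so a colorscale *name* yields its first character.
from_name = first_element("Viridis")
# A list of colors yields its first color.
from_list = first_element(VIRIDIS)

print(repr(from_name))  # 'V'
print(repr(from_list))  # '#440154'
```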

I'll mark this as a bug for now, and we'll see what our engineers think once they have more time to take a deeper look.

empet commented 8 months ago

To understand what happens we have to study the "anatomy" of this fig:

fig=px.density_heatmap(df, marginal_y="histogram", marginal_x="histogram")

Let us inspect:

fig.data:
(Histogram2d({
     'coloraxis': 'coloraxis',
     'hovertemplate': 'index=%{x}<br>value=%{y}<br>count=%{z}<extra></extra>',
     'name': '',
     'x': array([0, 1, 0, 1], dtype=int64),
     'xaxis': 'x',
     'xbingroup': 'x',
     'y': array([1, 2, 3, 4], dtype=int64),
     'yaxis': 'y',
     'ybingroup': 'y'
 }),
 Histogram({
     'alignmentgroup': 'True',
     'bingroup': 'x',
     'hovertemplate': 'index=%{x}<br>count=%{y}<extra></extra>',
     'legendgroup': '',
     'marker': {'color': '#0d0887'},
     'name': '',
     'offsetgroup': '',
     'opacity': 0.5,
     'showlegend': False,
     'x': array([0, 1, 0, 1], dtype=int64),
     'xaxis': 'x3',
     'yaxis': 'y3'
 }),
 Histogram({
     'alignmentgroup': 'True',
     'bingroup': 'y',
     'hovertemplate': 'value=%{y}<br>count=%{x}<extra></extra>',
     'legendgroup': '',
     'marker': {'color': '#0d0887'},
     'name': '',
     'offsetgroup': '',
     'opacity': 0.5,
     'showlegend': False,
     'xaxis': 'x2',
     'y': array([1, 2, 3, 4], dtype=int64),
     'yaxis': 'y2'
 }))

and

fig.layout.coloraxis
layout.Coloraxis({
    'colorbar': {'title': {'text': 'count'}},
    'colorscale': [[0.0, '#0d0887'], [0.1111111111111111, '#46039f'],
                   [0.2222222222222222, '#7201a8'], [0.3333333333333333,
                   '#9c179e'], [0.4444444444444444, '#bd3786'],
                   [0.5555555555555556, '#d8576b'], [0.6666666666666666,
                   '#ed7953'], [0.7777777777777778, '#fb9f3a'],
                   [0.8888888888888888, '#fdca26'], [1.0, '#f0f921']]
})

What do we see? The values in the Histogram2d are colormapped to the default colorscale, Plasma, set via layout.coloraxis.colorscale, while the two marginal histograms are drawn in the first color of the Plasma colorscale, '#0d0887', with opacity 0.5. Hence, if you want to change the colorscale to "Viridis", you can update it for the Histogram2d as follows: fig.update_layout(coloraxis_colorscale='Viridis'). However, you cannot access the color corresponding to 0.0 (the first color) in the Viridis colorscale in order to set it as the color of the marginal histograms. In conclusion, I think that there is no bug: this type of plot was designed only for the Plasma colorscale, i.e. for the actual default settings, because setting the color for the marginal distributions requires a non-standard operation. Looking at https://github.com/plotly/plotly.py/blob/master/packages/python/plotly/plotly/express/colors/__init__.py, we realize that using a particular colorscale would involve many searches to access the first color in that colorscale.

iAnanich commented 8 months ago

I don't think the intention here was to give the marginal histogram the colorscale selected for the heatmap, but rather just to add a marginal histogram alongside the main focus, the heatmap. That is, I would be fine if the marginal histogram stayed in whatever color it has, as long as it is there and the heatmap follows the desired colormap. I used heatmaps with a custom colormap for a long time without problems and didn't care that the marginal histogram had a single color (i.e. the colorscale was applied only to the heatmap, not to the marginal histogram). (By the way, in plotly_dark they look terrible: hard to distinguish.) I hope the intention is clear: to be able to use heatmaps with colorscales while also adding marginal histograms (with or without a colorscale), not to have both the heatmap and the histograms in a single colorscale.

I was going to try a workaround by converting Viridis to a 2D list, but I see that it might not be so easy...

empet commented 8 months ago

I explained how this kind of px.density_heatmap is designed. You can change the default settings by performing trace updates (for the color of the marginal histograms) and a layout update (for the colorscale): fig.update_layout(coloraxis_colorscale='NewcolorscaleName')

iAnanich commented 8 months ago

Example that still works:

# custom color scale definition
cscale = [
    [0, '#FFFF66'],
    [0.05, '#FFFF66'],
    [0.7, '#ff0000'],
    [1, '#ff0000']
]

# no error
px.density_heatmap(df, marginal_y="histogram", marginal_x="histogram", color_continuous_scale=cscale)

What is strange to me is that I can set a custom colorscale and it still works. And the histogram doesn't try to be colorful in this case.
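
One possible reading of why the list form gets through, based only on the error text above (which says a color may also be a number or "a list or array of any of the above"): if the first element of the scale is what ends up as the marker color, then for a custom scale that element is the pair [0, '#FFFF66'], which passes as a list of colors. The checker below is a deliberately simplified stand-in, not plotly's validator; `looks_like_color` is a hypothetical name and it only knows hex strings, numbers and lists of those.

```python
import re
from numbers import Number

HEX_COLOR = re.compile(r"^#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})$")


def looks_like_color(value):
    """Simplified stand-in for a color check: hex string, number, or list of those."""
    if isinstance(value, str):
        return bool(HEX_COLOR.match(value))
    if isinstance(value, Number) and not isinstance(value, bool):
        return True
    if isinstance(value, (list, tuple)):
        return all(looks_like_color(item) for item in value)
    return False


cscale = [[0, "#FFFF66"], [0.05, "#FFFF66"], [0.7, "#ff0000"], [1, "#ff0000"]]

# First element of a colorscale name is a single letter: rejected.
print(looks_like_color("Viridis"[0]))
# First element of a custom scale is a [number, hex] pair: accepted.
print(looks_like_color(cscale[0]))
```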

Judging by the fact that this issue is only being brought up now (I have a vague memory of this same error from a few months ago), people are not using these two features together.

I appreciate the workaround you provided, but I don't see any intention on the OP's part to apply the colorscale to the marginal histogram, nor do I want that. The point is simply that using a colorscale by name for a heatmap with a marginal histogram (regardless of the colors of the marginal histogram) does not work, which is surprising, and the resulting error is poorly communicated.

I suggest letting the marginal histogram ignore the color_continuous_scale param from density_heatmap and similar figure factories, because the histogram won't benefit much from being colorful. Maybe it would be nice to have a separate way of making that histogram obey some custom colorscale, but it seems that hasn't been requested yet.

iAnanich commented 8 months ago

This code snippet from ChatGPT 3.5 does not raise an error:

import numpy as np
import plotly.colors as pc

# Sample values (from 0 to 1)
values = np.linspace(0, 1, 10)  # Replace with your values

# Get the Viridis colorscale
viridis_colorscale = pc.sequential.Viridis

# Map values to colors based on the Viridis colorscale
colors_for_values = [[value, viridis_colorscale[int(value * (len(viridis_colorscale) - 1))]] for value in values]

# does not raise error
px.density_heatmap(df, marginal_y="histogram", marginal_x="histogram", color_continuous_scale=colors_for_values)
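
The same list of [position, color] pairs can be built without numpy and without converting floats back into list indices. A sketch, with the Viridis hex values hard-coded (as listed in plotly.colors.sequential) and `to_colorscale_pairs` as a hypothetical helper name; it spaces the colors evenly from 0 to 1 using integer indices.

```python
def to_colorscale_pairs(colors):
    """Hypothetical helper: spread colors evenly over [0, 1] as [position, color] pairs."""
    if len(colors) < 2:
        raise ValueError("need at least two colors")
    last = len(colors) - 1
    return [[index / last, color] for index, color in enumerate(colors)]


# Viridis colors, hard-coded from plotly.colors.sequential.Viridis.
viridis = [
    "#440154", "#482878", "#3e4989", "#31688e", "#26828e",
    "#1f9e89", "#35b779", "#6ece58", "#b5de2b", "#fde725",
]

pairs = to_colorscale_pairs(viridis)
print(pairs[0], pairs[-1])
```

The resulting `pairs` list has the same shape as the custom scale shown earlier, so it could be passed as color_continuous_scale in the same way.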

empet commented 8 months ago

The solution given by ChatGPT is a workaround that redefines an existing colorscale to give access to its first color. You can avoid this redefinition, as I said in the previous post, by updates:

import plotly.colors as pc
import plotly.express as px
# df is the DataFrame from the earlier examples in this thread
fig = px.density_heatmap(df, marginal_y="histogram", marginal_x="histogram")
fig.update_layout(coloraxis_colorscale="Viridis")
fig.update_traces(marker_color=pc.sequential.Viridis[0], selector=dict(type="histogram"))

iAnanich commented 7 months ago

I see the issue doesn't progress...

Could my suggestion be implemented?

I suggest letting the marginal histogram ignore the color_continuous_scale param from density_heatmap and similar figure factories...

gvwilson commented 1 month ago

Hi - we are tidying up stale issues and PRs in Plotly's public repositories so that we can focus on things that are still important to our community. Since this one has been sitting for a while, I'm going to close it; if you'd like to submit a PR, we'd be happy to prioritize a review, and if it's a request for tech support, please post in our community forum. Thank you - @gvwilson