plotly / plotly.py

The interactive graphing library for Python :sparkles: This project now includes Plotly Express!
https://plotly.com/python/
MIT License

px.scatter() errors when size param is Pandas Extension dtype Int64 #3071

Open CodeCox opened 3 years ago

CodeCox commented 3 years ago

Description

px.scatter() raises an error when the size parameter refers to a column with the pandas extension dtype Int64. The same call works as expected with the standard NumPy int64 dtype.

The new pandas extension dtypes should be supported in Plotly Express. For example, I use DataFrame.convert_dtypes(), which changes int64 columns to Int64.

Example Code

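import pandas as pd
import plotly.express as px

# Nullable extension dtype 'Int64' (capital I), not NumPy 'int64'
df = pd.DataFrame({
        'myx': [1, 2, 3, 4, 5],
        'myy': [10, 20, 30, 40, 50],
        'mysize': [100, 200, 300, 400, 500]},
        dtype='Int64')

# Raises the ValueError shown under "Error"
px.scatter(df, x='myx', y='myy', size='mysize')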

Error

Traceback

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\Miniconda3\envs\jlab3\lib\site-packages\_plotly_utils\basevalidators.py in validate_coerce(self, v)
    749             try:
--> 750                 v_array = copy_to_readonly_numpy_array(v, force_numeric=True)
    751             except (ValueError, TypeError, OverflowError):

~\Miniconda3\envs\jlab3\lib\site-packages\_plotly_utils\basevalidators.py in copy_to_readonly_numpy_array(v, kind, force_numeric)
    108         if is_numpy_convertable(v):
--> 109             return copy_to_readonly_numpy_array(
    110                 np.array(v), kind=kind, force_numeric=force_numeric

~\Miniconda3\envs\jlab3\lib\site-packages\_plotly_utils\basevalidators.py in copy_to_readonly_numpy_array(v, kind, force_numeric)
    137     if force_numeric and new_v.dtype.kind not in numeric_kinds:
--> 138         raise ValueError(
    139             "Input value is not numeric and" "force_numeric parameter set to True"

ValueError: Input value is not numeric andforce_numeric parameter set to True

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
 in
      5         dtype= 'Int64')
      6
----> 7 px.scatter(df, x='myx', y='myy', size='mysize')

~\Miniconda3\envs\jlab3\lib\site-packages\plotly\express\_chart_types.py in scatter(data_frame, x, y, color, symbol, size, hover_name, hover_data, custom_data, text, facet_row, facet_col, facet_col_wrap, facet_row_spacing, facet_col_spacing, error_x, error_x_minus, error_y, error_y_minus, animation_frame, animation_group, category_orders, labels, orientation, color_discrete_sequence, color_discrete_map, color_continuous_scale, range_color, color_continuous_midpoint, symbol_sequence, symbol_map, opacity, size_max, marginal_x, marginal_y, trendline, trendline_color_override, log_x, log_y, range_x, range_y, render_mode, title, template, width, height)
     62     mark in 2D space.
     63     """
---> 64     return make_figure(args=locals(), constructor=go.Scatter)
     65
     66

~\Miniconda3\envs\jlab3\lib\site-packages\plotly\express\_core.py in make_figure(args, constructor, trace_patch, layout_patch)
   2017                     args, trace_spec, group, mapping_labels.copy(), sizeref
   2018                 )
-> 2019                 trace.update(patch)
   2020                 if fit_results is not None:
   2021                     trendline_rows.append(mapping_labels.copy())

~\Miniconda3\envs\jlab3\lib\site-packages\plotly\basedatatypes.py in update(self, dict1, overwrite, **kwargs)
   5067                 BaseFigure._perform_update(self, kwargs, overwrite=overwrite)
   5068         else:
-> 5069             BaseFigure._perform_update(self, dict1, overwrite=overwrite)
   5070             BaseFigure._perform_update(self, kwargs, overwrite=overwrite)
   5071

~\Miniconda3\envs\jlab3\lib\site-packages\plotly\basedatatypes.py in _perform_update(plotly_obj, update_obj, overwrite)
   3883                     # Update compound objects recursively
   3884                     # plotly_obj[key].update(val)
-> 3885                     BaseFigure._perform_update(plotly_obj[key], val)
   3886                 elif isinstance(validator, CompoundArrayValidator):
   3887                     if plotly_obj[key]:

~\Miniconda3\envs\jlab3\lib\site-packages\plotly\basedatatypes.py in _perform_update(plotly_obj, update_obj, overwrite)
   3904                 else:
   3905                     # Assign non-compound value
-> 3906                     plotly_obj[key] = val
   3907
   3908         elif isinstance(plotly_obj, tuple):

~\Miniconda3\envs\jlab3\lib\site-packages\plotly\basedatatypes.py in __setitem__(self, prop, value)
   4802                 # ### Handle simple property ###
   4803                 else:
-> 4804                     self._set_prop(prop, value)
   4805             else:
   4806                 # Make sure properties dict is initialized

~\Miniconda3\envs\jlab3\lib\site-packages\plotly\basedatatypes.py in _set_prop(self, prop, val)
   5146                 return
   5147             else:
-> 5148                 raise err
   5149
   5150         # val is None

~\Miniconda3\envs\jlab3\lib\site-packages\plotly\basedatatypes.py in _set_prop(self, prop, val)
   5141
   5142         try:
-> 5143             val = validator.validate_coerce(val)
   5144         except ValueError as err:
   5145             if self._skip_invalid:

~\Miniconda3\envs\jlab3\lib\site-packages\_plotly_utils\basevalidators.py in validate_coerce(self, v)
    750                 v_array = copy_to_readonly_numpy_array(v, force_numeric=True)
    751             except (ValueError, TypeError, OverflowError):
--> 752                 self.raise_invalid_val(v)
    753
    754             # Check min/max

~\Miniconda3\envs\jlab3\lib\site-packages\_plotly_utils\basevalidators.py in raise_invalid_val(self, v, inds)
    275                 name += "[" + str(i) + "]"
    276
--> 277         raise ValueError(
    278             """
    279     Invalid value of type {typ} received for the '{name}' property of {pname}

ValueError:
    Invalid value of type 'pandas.core.series.Series' received for the 'size' property of scatter.marker
        Received value: 0    100
1    200
2    300
3    400
4    500
Name: mysize, dtype: Int64

    The 'size' property is a number and may be specified as:
      - An int or float in the interval [0, inf]
      - A tuple, list, or one-dimensional numpy array of the above
```
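The first exception in the traceback comes from a check on `new_v.dtype.kind`. The following is a minimal sketch of that kind of check, not plotly's actual code: `NUMERIC_KINDS` and `passes_numeric_check` are hypothetical stand-ins, and the contents of the real `numeric_kinds` are an assumption. It shows why an object-dtype array (which is likely what `np.array()` returned for an Int64 Series on the pandas version reported here) would be rejected, while a plain int64 array passes.

```python
import numpy as np
import pandas as pd

# Assumption: a guess at the dtype kinds the validator treats as numeric
# (unsigned int, signed int, float). The real set may differ.
NUMERIC_KINDS = {"u", "i", "f"}


def passes_numeric_check(arr: np.ndarray) -> bool:
    """Hypothetical stand-in for the dtype.kind test seen in the traceback."""
    return arr.dtype.kind in NUMERIC_KINDS


sizes = pd.Series([100, 200, 300, 400, 500], dtype="Int64", name="mysize")

# Explicitly request both representations so the result does not depend on
# how a given pandas version converts nullable integers by default.
as_object = sizes.to_numpy(dtype=object)   # dtype.kind == "O"
as_int = sizes.to_numpy(dtype="int64")     # dtype.kind == "i"

print(as_object.dtype.kind, passes_numeric_check(as_object))
print(as_int.dtype.kind, passes_numeric_check(as_int))
```

Newer pandas releases may convert nullable integers to a NumPy numeric dtype by default, so the behaviour can differ between versions.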

Versions

conda env

```
Windows 10
jupyterlab 3.0.7
python 3.8.6
pandas 1.2.1
plotly 4.14.3
```

alexis-scopely commented 2 years ago

Any progress on this one? I am having the same error

johentsch commented 11 months ago

Hi everyone, this bug unfortunately persists in Plotly 5.18.0, nearly three years after it was first reported here. What makes it worse is that the error message does not help in finding the root cause, because it lists the following among the accepted inputs:

A tuple, list, or one-dimensional numpy array of the above

If the community is not willing to fix this for some reason, the error message should at least tell users that nullable integers are not accepted. They have been around for quite a while, so it makes sense to take them into account, ideally by accepting them as integers.
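Until the message improves, users can check for this themselves. Below is a rough sketch of a helper that lists the columns of a DataFrame that use a pandas extension dtype; `extension_dtype_columns` is a hypothetical name, not part of pandas or plotly.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionDtype


def extension_dtype_columns(df: pd.DataFrame) -> dict:
    """Return {column name: dtype name} for columns backed by an extension dtype."""
    return {
        col: str(dtype)
        for col, dtype in df.dtypes.items()
        if isinstance(dtype, ExtensionDtype)
    }


df = pd.DataFrame({
    "myx": pd.array([1, 2, 3], dtype="Int64"),      # nullable extension dtype
    "myy": np.array([10, 20, 30], dtype="int64"),   # plain NumPy dtype
})

flagged = extension_dtype_columns(df)
print(flagged)
```

Any column reported here is a candidate for the error above when passed as size.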

Fixing this might be a good occasion to also address related issues such as #3495

Coding-with-Adam commented 11 months ago

Thanks for following up on this issue, @johentsch. I just ran the code provided above and can confirm that it works when the data type is NumPy int64 (dtype='int64').

Let us look into this and see the best way to move forward.
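In the meantime, one possible workaround is to cast the nullable columns to a NumPy dtype before plotting. This is a sketch under assumptions: `cast_for_plotly` is a hypothetical helper, and float64 is chosen only because it can also represent missing values as NaN.

```python
import pandas as pd


def cast_for_plotly(df: pd.DataFrame, columns) -> pd.DataFrame:
    """Return a copy of df with the given columns cast to NumPy float64."""
    out = df.copy()
    for col in columns:
        # Nullable Int64 -> float64; pd.NA values become NaN
        out[col] = out[col].astype("float64")
    return out


df = pd.DataFrame({
    "myx": [1, 2, 3, 4, 5],
    "myy": [10, 20, 30, 40, 50],
    "mysize": [100, 200, 300, 400, 500]},
    dtype="Int64")

plot_df = cast_for_plotly(df, ["myx", "myy", "mysize"])
print(plot_df.dtypes)
```

The converted frame can then be passed to the original call, e.g. px.scatter(plot_df, x='myx', y='myy', size='mysize'). If the data has no missing values, casting to 'int64' instead matches the case confirmed to work above.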

cbpygit commented 9 months ago

This problem persists in January 2024.