wjm41 / molplotly

Add-on to plotly which shows molecule images on mouseover!
Apache License 2.0

Defining color and markers simultaneously in px.scatter causes issues with hoverbox #6

Closed: chertianser closed this issue 2 years ago

chertianser commented 2 years ago

Hi there, thanks for providing a great and easy-to-use tool!

This issue is reproducible with the first example in the documentation:

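import plotly.express as px
import molplotly

# df_esol is the ESOL dataframe loaded as in the documentation's first example
df_esol['delY'] = df_esol["y_pred"] - df_esol["y_true"]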
fig_scatter = px.scatter(df_esol,
                         x="y_true",
                         y="y_pred",
                         color='delY',
                         symbol='Minimum Degree', # <- addition
                         title='ESOL Regression (default plotly)',
                         labels={'y_pred': 'Predicted Solubility',
                                 'y_true': 'Measured Solubility',
                                 'delY': 'ΔY'},
                         width=1200,
                         height=800)

# This adds a dashed line for what a perfect model _should_ predict
y = df_esol["y_true"].values
fig_scatter.add_shape(
    type="line", line=dict(dash='dash'),
    x0=y.min(), y0=y.min(),
    x1=y.max(), y1=y.max()
)

fig_scatter.update_layout(title='ESOL Regression (with add_molecules!)')

app_scatter = molplotly.add_molecules(fig=fig_scatter,
                                      df=df_esol,
                                      smiles_col='smiles',
                                      title_col='Compound ID',
                                      color_col='delY' # <- addition
                                      )

# change the arguments here to run the dash app on an external server and/or change the size of the app!
app_scatter.run_server(mode='inline', port=8001, height=1000)

This returns

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
~/anaconda3/envs/ml/lib/python3.7/site-packages/molplotly/main.py in display_hover(
    hoverData={'points': [{'bbox': {'x0': 948.39, 'x1': 950.39, 'y0': 177.7, 'y1': 179.7}, 'curveNumber': 0, 'marker.color': -0.48000000000000004, 'pointIndex': 960, 'pointNumber': 960, 'x': 0.79, 'y': 0.31}]}
)
    111             df_curve = df[df[color_col] ==
    112                           curve_dict[curve_num]].reset_index(drop=True)
--> 113             df_row = df_curve.iloc[num]
        df_row = undefined
        df_curve.iloc = <pandas.core.indexing._iLocIndexer object at 0x7f7e3d16c950>
        num = 960
    114         else:
    115             df_row = df.iloc[num]

~/anaconda3/envs/ml/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(
    self=<pandas.core.indexing._iLocIndexer object>,
    key=960
)
    929 
    930             maybe_callable = com.apply_if_callable(key, self.obj)
--> 931             return self._getitem_axis(maybe_callable, axis=axis)
        self._getitem_axis = <bound method _iLocIndexer._getitem_axis of <pandas.core.indexing._iLocIndexer object at 0x7f7e3d490c50>>
        maybe_callable = 960
        axis = 0
    932 
    933     def _is_scalar_access(self, key: tuple):

~/anaconda3/envs/ml/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_axis(
    self=<pandas.core.indexing._iLocIndexer object>,
    key=960,
    axis=0
)
   1564 
   1565             # validate the location
-> 1566             self._validate_integer(key, axis)
        self._validate_integer = <bound method _iLocIndexer._validate_integer of <pandas.core.indexing._iLocIndexer object at 0x7f7e3d490c50>>
        key = 960
        axis = 0
   1567 
   1568             return self.obj._ixs(key, axis=axis)

~/anaconda3/envs/ml/lib/python3.7/site-packages/pandas/core/indexing.py in _validate_integer(
    self=<pandas.core.indexing._iLocIndexer object>,
    key=960,
    axis=0
)
   1498         len_axis = len(self.obj._get_axis(axis))
   1499         if key >= len_axis or key < -len_axis:
-> 1500             raise IndexError("single positional indexer is out-of-bounds")
        global IndexError = undefined
   1501 
   1502     # -------------------------------------------------------------------

IndexError: single positional indexer is out-of-bounds

Using only symbol or only color causes no issues with the hoverbox. Also, passing Minimum Degree as color_col to add_molecules when both color and symbol are defined causes no issues.

wjm41 commented 2 years ago

I seem to have found the issue: when a symbol column is passed to the scatter plot, plotly splits the data into separate scatter traces (three here), each with a different symbol shape. The point numbers in the hover data then count within each trace rather than across the whole dataframe, which is why it breaks the indexing. I'll implement a fix alongside some new features, hopefully this weekend! :)
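
To illustrate the indexing problem described above, here is a minimal plain-Python sketch. It is a model of the behaviour, not plotly's or molplotly's actual code: the rows, the `min_degree` key, and the helpers `split_into_traces` and `resolve_hover` are all hypothetical, and it assumes traces are formed in order of first appearance of each symbol value.

```python
# Plain-Python model of per-trace point numbering.
# Not plotly's or molplotly's code; data and helper names are made up.
from collections import defaultdict

rows = [
    {"smiles": "CCO", "min_degree": 1},
    {"smiles": "c1ccccc1", "min_degree": 2},
    {"smiles": "CC(=O)O", "min_degree": 1},
    {"smiles": "C1CC1", "min_degree": 2},
    {"smiles": "CCN", "min_degree": 1},
]


def split_into_traces(rows, key):
    """Group global row indices by `key`, one list per trace,
    in order of first appearance of each key value."""
    traces = defaultdict(list)
    for global_index, row in enumerate(rows):
        traces[row[key]].append(global_index)
    return list(traces.values())


def resolve_hover(traces, curve_number, point_number):
    """Map a (curveNumber, pointNumber) pair back to a global row index."""
    return traces[curve_number][point_number]


traces = split_into_traces(rows, "min_degree")

# Hover event on the second point of the second trace.
hover = {"curveNumber": 1, "pointNumber": 1}

# Naive lookup: treats the per-trace point number as a global row position.
naive_row = rows[hover["pointNumber"]]

# Trace-aware lookup: uses both the curve number and the point number.
correct_row = rows[resolve_hover(traces, hover["curveNumber"], hover["pointNumber"])]

print(naive_row["smiles"], correct_row["smiles"])
```

In this toy data the naive lookup silently returns the wrong molecule. In the traceback above, the lookup instead ran against a filtered dataframe with fewer than 961 rows, so position 960 was out of bounds.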

wjm41 commented 2 years ago

The issue has been fixed in merge #9 - I will make a new release alongside other features once I've updated the docs :)