predict-idlab / plotly-resampler

Visualize large time series data with plotly.py
https://predict-idlab.github.io/plotly-resampler/latest
MIT License

[BUG] Hover data does not match the displayed resampled point in scatter plot #270

Closed aasmune closed 8 months ago

aasmune commented 8 months ago

Describe the bug :crayon: When using register_plotly_resampler, values passed through the hover_data argument of px.scatter sometimes show up incorrectly when hovering over a point with the mouse. The correct values are displayed when register_plotly_resampler is not used.

Reproducing the bug :mag:

import plotly.express as px
from plotly_resampler import register_plotly_resampler

import pandas as pd
import numpy as np

labels = list(range(0, 3))
np.random.seed(0xdeadbeef)
x = np.random.normal(size=100000)
y = np.random.normal(size=100000)
label = np.random.randint(low=labels[0], high=labels[-1]+1, size=100000).astype(str)
description = np.random.randint(low=3, high=5, size=100000)

df = pd.DataFrame.from_dict({"x": x, "y": y, "label": label, "description": description})

x_label = 'x'
y_label = 'y'
label_label = "label"
df = df.sort_values(by=[x_label])

print("Highlighted point on screenshot:")
print(df[np.isclose(df[x_label], 3.864907)])

# Without resampler, shows correct hover data
fig = px.scatter(df, x=x_label, y=y_label, color=label_label, title="Without resampler", hover_data=["description"])
fig.show()

# With resampler, shows incorrect hover data
register_plotly_resampler(mode="auto", default_n_shown_samples=10000)
fig2 = px.scatter(df, x=x_label, y=y_label, color=label_label, title="With resampler", hover_data=["description"])
fig2.show()

Printed output for the point highlighted in the screenshots:

Highlighted point on screenshot:
              x         y label  description
51820  3.864907  0.705485     0            3

Expected behavior :wrench: I expect the hover data to be correct for the displayed resampled point.
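
To make the expectation concrete, here is a minimal NumPy sketch of what "correct hover data" means after downsampling. It is not plotly-resampler's implementation, and the failure it shows is only one plausible way such a mismatch could arise, not a confirmed diagnosis. `downsample_indices` is a hypothetical stand-in for whatever aggregator selects the displayed points.

```python
import numpy as np


def downsample_indices(n_total: int, n_out: int) -> np.ndarray:
    """Hypothetical stand-in for an aggregator: pick n_out evenly spaced row positions."""
    return np.linspace(0, n_total - 1, n_out).astype(int)


rng = np.random.default_rng(0)
n = 1_000
x = np.sort(rng.normal(size=n))
y = rng.normal(size=n)
description = rng.integers(low=3, high=5, size=n)  # values 3 or 4, as in the report

idx = downsample_indices(n, 100)

# Expected: every per-point array is subset with the same indices.
x_shown, y_shown = x[idx], y[idx]
hover_aligned = description[idx]

# Assumed failure mode: x and y are subset, but the hover array is left at
# full length and then read by position in the reduced trace.
hover_misaligned = description[: len(idx)]

# Fraction of displayed points whose positional hover value would be wrong.
mismatch_rate = float(np.mean(hover_misaligned != hover_aligned))
print(f"mismatched hover values: {mismatch_rate:.0%}")
```

Under this assumption, the x and y coordinates of every displayed point are still correct, and only the extra per-point columns drift, which matches the symptom described here.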

Screenshots :camera_flash: Without resampler - shows correct hover data [screenshot: without_resampler]

With resampler - shows incorrect hover data [screenshot: with_resampler]

Environment information:

jonasvdd commented 8 months ago

Hi @aasmune,

Thank you for submitting this bug. I will look into it sometime later this week and get back to you!

P.S. The aggregators that plotly-resampler employs are designed to select data points from consecutive/sequential/temporal data (which often boils down to selecting vertical extrema, as can be seen in your resampled plot). For x vs. y scatters, other dimensionality reduction techniques may be more suitable.

Kind regards, Jonas
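
As an illustration of the kind of alternative the P.S. hints at, the sketch below thins an x vs. y scatter by keeping one point per occupied cell of a 2D grid. This is an assumption about what "more suitable" could look like, not something plotly-resampler is stated to provide. `grid_thin` and `n_bins` are hypothetical names. Because the function returns a row subset of the DataFrame, any columns used for hover data stay attached to the right points.

```python
import numpy as np
import pandas as pd


def grid_thin(df: pd.DataFrame, x: str, y: str, n_bins: int = 100) -> pd.DataFrame:
    """Keep the first row that falls into each occupied cell of an n_bins x n_bins grid."""
    # With labels=False, pd.cut returns the integer bin index of each row.
    x_bin = pd.cut(df[x], bins=n_bins, labels=False)
    y_bin = pd.cut(df[y], bins=n_bins, labels=False)
    cell = x_bin * n_bins + y_bin  # one integer id per grid cell
    keep = ~cell.duplicated()      # True for the first row seen in each cell
    return df[keep]


rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "x": rng.normal(size=100_000),
        "y": rng.normal(size=100_000),
        "description": rng.integers(low=3, high=5, size=100_000),
    }
)

thinned = grid_thin(df, "x", "y", n_bins=100)
print(len(df), "->", len(thinned))
```

The thinned frame can then be passed to px.scatter with hover_data=["description"] as in the report, without registering the resampler. The trade-off is that dense regions are flattened to at most one point per cell, so density information is lost unless it is encoded separately.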