predict-idlab / plotly-resampler

Visualize large time series data with plotly.py
https://predict-idlab.github.io/plotly-resampler/latest
MIT License

Log support #207

Closed jonasvdd closed 1 year ago

jonasvdd commented 1 year ago

Partially addresses #206 Closes #190

Specifically, zooming on log axes will be supported.

TODO:

jonasvdd commented 1 year ago

Implementation ⬇️ that projects the LTTB buckets onto equidistant log buckets:

import numpy as np
from plotly_resampler.aggregation.aggregation_interface import DataPointSelector
from typing import Union

class LogLTTB(DataPointSelector):
    @staticmethod
    def _argmax_area(prev_x, prev_y, avg_next_x, avg_next_y, x_bucket, y_bucket) -> int:
        """Vectorized triangular area argmax computation.

        Parameters
        ----------
        prev_x : float
            The x value of the previously selected point.
        prev_y : float
            The y value of the previously selected point.
        avg_next_x : float
            The x mean of the next bucket
        avg_next_y : float
            The y mean of the next bucket
        x_bucket : np.ndarray
            All x values in the bucket
        y_bucket : np.ndarray
            All y values in the bucket

        Returns
        -------
        int
            The index of the point with the largest triangular area.
        """
        return np.abs(
            x_bucket * (prev_y - avg_next_y)
            + y_bucket * (avg_next_x - prev_x)
            + (prev_x * avg_next_y - avg_next_x * prev_y)
        ).argmax()

    def _arg_downsample(
        self, x: Union[np.ndarray, None], y: np.ndarray, n_out: int, **kwargs
    ) -> np.ndarray:
        """TODO complete docs"""
        # We need a valid x array to determine the x-range
        assert x is not None, "x cannot be None for this downsampler"

        # the log transform and its matching inverse (expm1 undoes log1p)
        lf, lf_inv = np.log1p, np.expm1

        offset = np.unique(
            np.searchsorted(
                x, lf_inv(np.linspace(lf(x[0]), lf(x[-1]), n_out + 1)).astype(np.int64)
            )
        )

        # Construct the output array
        sampled_x = np.empty(len(offset) + 1, dtype="int64")
        sampled_x[0] = 0
        sampled_x[-1] = x.shape[0] - 1

        # Convert x & y to int if it is boolean
        if x.dtype == np.bool_:
            x = x.astype(np.int8)
        if y.dtype == np.bool_:
            y = y.astype(np.int8)

        a = 0
        for i in range(len(offset) - 2):
            a = (
                self._argmax_area(
                    prev_x=x[a],
                    prev_y=y[a],
                    avg_next_x=np.mean(x[offset[i + 1] : offset[i + 2]]),
                    avg_next_y=y[offset[i + 1] : offset[i + 2]].mean(),
                    x_bucket=x[offset[i] : offset[i + 1]],
                    y_bucket=y[offset[i] : offset[i + 1]],
                )
                + offset[i]
            )
            sampled_x[i + 1] = a

        # ------------ EDGE CASE ------------
        # next-average of last bucket = last point
        sampled_x[-2] = (
            self._argmax_area(
                prev_x=x[a],
                prev_y=y[a],
                avg_next_x=x[-1],  # last point
                avg_next_y=y[-1],
                x_bucket=x[offset[-2] : offset[-1]],
                y_bucket=y[offset[-2] : offset[-1]],
            )
            + offset[-2]
        )
        return sampled_x
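
To make the bucketing step easier to follow in isolation, here is a minimal NumPy-only sketch of how log-equidistant bucket offsets can be derived. The helper name `log_bucket_edges` is hypothetical and not part of plotly-resampler; it assumes `x` is sorted and non-negative, and unlike the class above it skips the `int64` cast of the edge values.

```python
import numpy as np


def log_bucket_edges(x: np.ndarray, n_out: int) -> np.ndarray:
    """Index offsets that split a sorted, non-negative x into buckets
    of equal width on a log1p scale (hypothetical helper, illustration only)."""
    # n_out + 1 equidistant edges in log1p space, mapped back with expm1
    edges = np.expm1(np.linspace(np.log1p(x[0]), np.log1p(x[-1]), n_out + 1))
    # Translate edge values into positions in x; np.unique drops the
    # duplicate offsets that would otherwise produce empty buckets
    return np.unique(np.searchsorted(x, edges))


x = np.arange(100_000)
offsets = log_bucket_edges(x, n_out=10)
sizes = np.diff(offsets)  # number of samples per bucket
print(offsets)
print(sizes)
```

On a linear index like this, the bucket sizes grow roughly geometrically, so the first decades of the x-range get as many selected points as the last one. That is what keeps the left side of a log axis from looking empty after downsampling.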

Example ⬇️ that creates a log plot:

import numpy as np
import plotly.graph_objects as go
from plotly_resampler import FigureResampler
from plotly_resampler.aggregation import NoGapHandler

n = 100_000
y = np.sin(np.arange(n) / 2_000) + np.random.randn(n) / 10

fr = FigureResampler(False)
fr.add_trace(
    go.Scattergl(mode="lines+markers", marker_color=np.abs(y) / np.max(np.abs(y))),
    hf_x=np.arange(n),
    hf_y=y,
    downsampler=LogLTTB(),
    gap_handler=NoGapHandler(),
)
fr.update_xaxes(type="log")
fr.update_layout(template='plotly_white', title='log axis demo')

[image: the resulting log-axis plot]

codecov-commenter commented 1 year ago

Codecov Report

Merging #207 (68ab1a9) into main (a14331e) will increase coverage by 0.01%. The diff coverage is 100.00%.


@@            Coverage Diff             @@
##             main     #207      +/-   ##
==========================================
+ Coverage   97.15%   97.16%   +0.01%     
==========================================
  Files          13       13              
  Lines         985      989       +4     
==========================================
+ Hits          957      961       +4     
  Misses         28       28              
| Impacted Files | Coverage Δ | |
|---|---|---|
| ..._resampler/aggregation/plotly_aggregator_parser.py | 94.56% <100.00%> (+0.12%) | :arrow_up: |
| ...ler/figure_resampler/figure_resampler_interface.py | 100.00% <100.00%> (ø) | |


jvdd commented 1 year ago

LGTM!