zillow / luminaire

Luminaire is a python package that provides ML driven solutions for monitoring time series data.
https://zillow.github.io/luminaire
Apache License 2.0

Top 5% volatility filtering location DataExploration #130

Open vincent1in opened 1 year ago

vincent1in commented 1 year ago

Code in exploration/data_exploration.py

def _shift_intensity(self, change_points=None, df=None, metric=None):
        """
        This function computes the Kullback-Leibler divergence of the time series around a changepoint detected by the
        pelt_change_point_detection() function. This considers Gaussian assumption on the underlying data distribution.

        :param list change_points: A list storing indices of the potential change points
        :param pandas.dataframe df: A pandas dataframe containing time series ignoring the top 5% volatility
        :param str metric: A string in the dataframe column names that contains the time series
        :return: A list containing the magnitude of changes for every corresponding change points
        :rtype: list
        """

Question: After looking through the code, I was wondering where the top 5% volatility is dropped. It doesn't look like the data is filtered anywhere before this function is called.

Thank you for the help!

sayanchk commented 1 year ago

@vincentlin2 This is a documentation error. Thanks for catching it!

We previously had logic that detected change points after removing volatility beyond the 95th percentile (p95). That filtering was later removed, so the docstring's reference to it is out of date.
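To make the idea of such a filter concrete, here is a rough sketch of dropping points whose volatility exceeds p95. It is an assumption for illustration only, not the code that was removed from luminaire: the function name `drop_top_volatility` is hypothetical, and "volatility" is taken here to mean the absolute change from the previous point, which may differ from the definition the library used.

```python
from statistics import quantiles


def drop_top_volatility(values, pct=95):
    """Keep (index, value) pairs whose volatility is at or below the pct-th percentile."""
    # Assumed volatility measure: absolute first difference (0 for the first point).
    vol = [0.0] + [abs(b - a) for a, b in zip(values, values[1:])]
    # quantiles(..., n=100) returns 99 cut points; position pct - 1 is the pct-th percentile.
    threshold = quantiles(vol, n=100, method="inclusive")[pct - 1]
    return [(i, v) for i, (v, s) in enumerate(zip(values, vol)) if s <= threshold]
```

With this measure, a single spike removes two points: the spike itself and the point right after it, because both have a large first difference.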