How to scale cluster centers back in the original scale

tslearn-team / tslearn

The machine learning toolkit for time series analysis in Python

BSD 2-Clause "Simplified" License


Hello @praveenpiisc, let's recall the transformation that is applied when time series are normalized using:

from tslearn.preprocessing import TimeSeriesScalerMeanVariance

scaler = TimeSeriesScalerMeanVariance()  # defaults: mu=0., std=1.
X_normalized = scaler.fit_transform(X)

This transformation is:

mean_t = np.nanmean(X_, axis=1, keepdims=True)
std_t = np.nanstd(X_, axis=1, keepdims=True)
std_t[std_t == 0.] = 1.
X_ = (X_ - mean_t) * self.std / std_t + self.mu

See: https://github.com/tslearn-team/tslearn/blob/9937946/tslearn/preprocessing/preprocessing.py#L204-L298

As its docstring says, this transformation "Scales time series so that their mean (resp. standard deviation) in each dimension is mu (resp. std)."

In this transformation, the mean and standard deviation are computed along axis=1, which is the time axis (the length of the time series). A separate mean and standard deviation are therefore computed for each time series and each dimension.
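To make the shapes concrete, here is a minimal NumPy-only sketch of the same computation. It assumes the default mu=0 and std=1 and a made-up toy array `X` in the (n_ts, sz, d) layout; it only mirrors the formula quoted above and does not call tslearn itself.

```python
import numpy as np

# Toy dataset (illustrative only): 3 series, 5 time steps, 2 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(loc=10.0, scale=3.0, size=(3, 5, 2))

# Same statistics as in the quoted snippet, with mu=0 and std=1 assumed.
mean_t = np.nanmean(X, axis=1, keepdims=True)
std_t = np.nanstd(X, axis=1, keepdims=True)
std_t[std_t == 0.] = 1.
X_normalized = (X - mean_t) / std_t

# One mean and one standard deviation per series and per dimension.
print(mean_t.shape)  # (3, 1, 2)
print(std_t.shape)   # (3, 1, 2)
```

The time axis is the only one that collapses to length 1, so each of the 3 series keeps its own pair of statistics in each of the 2 dimensions.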

As a result, there is no single global mean and standard deviation that could be used to rescale the cluster centers: each time series was normalized with its own statistics, so there is no unique "original scale".
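A small illustration of why the scale cannot be recovered, again as a NumPy-only sketch under the same assumptions (mu=0, std=1). The helper `normalize` and the two series `a` and `b` are hypothetical names introduced for this example; they are not part of tslearn.

```python
import numpy as np

def normalize(X, mu=0.0, std=1.0):
    """Per-series, per-dimension scaling, mirroring the formula quoted above."""
    mean_t = np.nanmean(X, axis=1, keepdims=True)
    std_t = np.nanstd(X, axis=1, keepdims=True)
    std_t[std_t == 0.] = 1.
    return (X - mean_t) * std / std_t + mu

# Two series with the same shape but very different offsets and amplitudes.
base = np.array([0.0, 1.0, 0.0, -1.0, 0.0]).reshape(1, 5, 1)
a = 2.0 * base + 5.0
b = 300.0 * base - 40.0

# Both map to the same normalized series, so the inverse is not unique.
print(np.allclose(normalize(a), normalize(b)))  # True
```

Since `a` and `b` are indistinguishable after normalization, a cluster center computed from normalized series could correspond to either scale, or to any other.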


How to scale cluster centers back in the original scale #512