Open francoisailah opened 2 months ago
If you make use of the get_datasets function here: https://github.com/ibm-granite/granite-tsfm/blob/main/tsfm_public/toolkit/time_series_preprocessor.py#L806, this is the process, assuming the preprocessor was instantiated with scaling=True:

1. The data is split into train/valid/test according to `split_config`.
2. `time_series_preprocessor.train(...)` fits the scalers on the train split.
3. `time_series_preprocessor.preprocess(...)` is applied to each split.
4. Windows of `context_length` followed by windows of `prediction_length` are created, which then become the past and future value tensors.

It seems that the test_data is normalized based on the mean and standard deviation of the train_data, which makes sense. Thank you,
Hello,
In `grp[self.target_columns] = self.target_scaler_dict[name].transform(grp[self.target_columns])`, where can I find the transform method?
`target_scaler_dict` is a dictionary of scalers -- each one is either a StandardScaler or a MinMaxScaler from sklearn, so `transform` is the standard sklearn scaler method.
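A minimal sketch of how a per-group scaler dictionary like this can work (the group names and data here are made up for illustration): one sklearn scaler per group id, fit on that group's training rows, then applied via the ordinary sklearn `transform()`:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy training data: one array of target values per group id.
train_groups = {
    "A": np.array([[1.0], [2.0], [3.0]]),
    "B": np.array([[10.0], [20.0], [30.0]]),
}

# One scaler per group, each fit only on that group's rows.
target_scaler_dict = {
    name: StandardScaler().fit(grp) for name, grp in train_groups.items()
}

# transform() comes from sklearn's scaler API: (x - mean_) / scale_.
scaled_A = target_scaler_dict["A"].transform(np.array([[2.0]]))
# Group A's mean is 2.0, so the scaled value is 0.0.
```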
In this case, looking at the following code, the test_data (train_data, valid_data) is normalized by its own min and max (or mean and std). Then the data is split into past_values, future_values, etc. Is this correct?

```python
train_valid_test = [train_data, valid_data, test_data]
train_valid_test_prep = [ts_preprocessor.preprocess(d) for d in train_valid_test]
```
Thank you
The `preprocess` method is called on each of train, valid, and test, so it happens after splitting.
In this case, normalizing the entire dataset could introduce look-ahead bias. Assume the vector X is normalized by its mean mu_X and std sigma_X, and then we split X into past values X1 and future values X2 (X = concat([X1, X2])), where the model aims to forecast X2 from X1. After normalization, mu_X = 0. Since mu_X = w1*mu_X1 + w2*mu_X2, with w1 = n1/(n1+n2), w2 = n2/(n1+n2), n1 = len(X1), and n2 = len(X2), it follows that mu_X2 = -(w1/w2)*mu_X1. That means the model has some information about the unseen data X2.
Looking at the code, it seems that the scaler is fit on the train data, which means that the test_data is normalized using the mean and std of the train data (via `transform`). In this case, there is no look-ahead bias.
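That fit-on-train, transform-everywhere procedure can be sketched with sklearn directly (the toy arrays here are made up for illustration):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

train = np.array([[1.0], [2.0], [3.0], [4.0]])
test = np.array([[5.0], [6.0]])

scaler = StandardScaler().fit(train)   # statistics come from train only
train_scaled = scaler.transform(train)
test_scaled = scaler.transform(test)

# test_scaled reuses train's mean (2.5) and std, so the scaled test
# split does not have mean 0, and no test information leaks into the
# normalization statistics.
```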
Thank you,
Hi @francoisailah, a couple of points:
Hello,
During time series normalization, are all data points within a channel (X) used to calculate the mean and standard deviation for normalization? Subsequently, is the normalized data split into past and future values for modeling?
Alternatively, is a forward normalization technique employed, where statistics are calculated on a rolling window basis for each prediction?
Thank you,