Hello,
I was trying to run N-HiTS with my own data using the shared Colab.
I tried to normalize the original ETTm2 dataset and compared the result with the data used in your N-HiTS model.
The size of df_train is 46641, and I followed the information given in section 4.1:
Each set is normalized with the train data mean and standard deviation.

    def normalize(df_csv, df_train):
        result = df_csv.copy()
        columns_names = list(df_csv.columns)
        for feature_name in columns_names[1:]:
            result[feature_name] = (df_csv[feature_name] - df_train[feature_name].mean()) / df_train[feature_name].std()
        return result

My function returns different results from yours:

    date                   HUFL
    2016-07-01 00:00:00    0.126520
    2016-07-01 00:15:00   -0.023339
    2016-07-01 00:30:00   -0.098268
    2016-07-01 00:45:00   -0.431177
    2016-07-01 01:00:00   -0.231432
    Name: HUFL, dtype: float64

and yours:

unique_id | ds | y
--- | --- | ---
HUFL | 2016-07-01 00:00:00 | -0.041413
HUFL | 2016-07-01 00:15:00 | -0.185467
HUFL | 2016-07-01 00:30:00 | -0.257495
HUFL | 2016-07-01 00:45:00 | -0.577510
HUFL | 2016-07-01 01:00:00 | -0.385501

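To narrow this down, here is a minimal sketch of a check on the two outputs. It assumes both series are z-scores of the same raw HUFL values, in which case they should differ only by a linear map whose slope and offset reflect the different mean and standard deviation. The variable names (`mine`, `yours`, `scale`, `shift`) are my own, and only the five rows shown above are used, so this is a rough diagnostic rather than a definitive answer.

```python
import numpy as np

# First five HUFL values from each output (copied from the rows above).
mine = np.array([0.126520, -0.023339, -0.098268, -0.431177, -0.231432])
yours = np.array([-0.041413, -0.185467, -0.257495, -0.577510, -0.385501])

# If mine = (x - mean_a) / std_a and yours = (x - mean_b) / std_b, then
#   yours = scale * mine + shift
# with scale = std_a / std_b and shift = (mean_a - mean_b) / std_b.
scale, shift = np.polyfit(mine, yours, deg=1)

# Largest deviation of the five points from the fitted line.
residual = np.max(np.abs(scale * mine + shift - yours))

print(f"scale={scale:.4f} shift={shift:.4f} max residual={residual:.2e}")
```

On these five rows the fit is almost exact, with a slope of roughly 0.96 and an offset of roughly -0.16. This suggests that the two normalizations use different mean and standard deviation values (for example, computed over a different training range), not a different formula.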
Can you please tell me more about the data normalization process?
Thanks and regards,
Sophie