Open TheaperDeng opened 3 years ago
Add a YEAR feature in gen_dt_feature.
Remove the parenthesized source column from the result column names of gen_dt_feature, e.g. MONTH(StartTime) -> MONTH.
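A minimal sketch of the two requests above, assuming the features are derived from a pandas datetime column. The function gen_dt_feature_sketch and its feature set are hypothetical stand-ins for illustration, not the actual gen_dt_feature implementation.

```python
import pandas as pd


def gen_dt_feature_sketch(dt: pd.Series) -> pd.DataFrame:
    """Hypothetical stand-in for gen_dt_feature.

    Emits plain feature names (MONTH rather than MONTH(StartTime))
    and includes the requested YEAR feature.
    """
    return pd.DataFrame(
        {
            "YEAR": dt.dt.year,
            "MONTH": dt.dt.month,
            "DAY": dt.dt.day,
        },
        index=dt.index,
    )


# "StartTime" mirrors the column name used in the example above.
start_time = pd.Series(pd.to_datetime(["2019-01-01", "2020-06-15"]), name="StartTime")
features = gen_dt_feature_sketch(start_time)
print(list(features.columns))
```

One trade-off of plain names: if features are generated from more than one datetime column, the names would collide, so a caller would need some other way to tell them apart.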
When non_pd_datetime appears, impute("linear") raises the following error: Cannot interpolate with all object-dtype columns in the DataFrame. Try setting at least one column to a numeric dtype
def get_multi_id_ts_df():
    return train_df.astype('object')

tsdata = TSDataset.from_pandas(df, target_col="value", dt_col="datetime", extra_feature_col=['extra feature'])
tsdata.impute("linear")

df = pd.DataFrame({"datetime": np.arange(100),
                   "id": np.array(['00']*100),
                   "value": np.random.randn(100),
                   "extra feature": np.random.randn(100)})
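The error message comes from pandas interpolation, which refuses to work on a frame whose columns are all object dtype. As a hedged sketch of one possible workaround (not necessarily how impute should be fixed), the numeric columns can be converted back with pd.to_numeric before interpolating. The small frame below is made up for illustration.

```python
import numpy as np
import pandas as pd

# An all-object frame, similar to what train_df.astype('object') produces.
df_obj = pd.DataFrame(
    {"value": [1.0, np.nan, 3.0], "extra feature": [10.0, np.nan, 30.0]}
).astype("object")

# Calling df_obj.interpolate(method="linear") directly is what triggers
# "Cannot interpolate with all object-dtype columns in the DataFrame".

# Assumed workaround: coerce the columns to a numeric dtype first.
converted = df_obj.apply(pd.to_numeric, errors="coerce")
filled = converted.interpolate(method="linear")
print(filled)
```

With errors="coerce", values that cannot be parsed become NaN and are then interpolated too, which may or may not be the desired behavior for real data.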
def not_aligned():
    df_val = pd.DataFrame({"id": np.array(['00']*20 + ['01']*30 + ['02']*50),
                           "value": np.random.randn(100),
                           "extra feature": np.random.randn(100)})
    data_sec = pd.DataFrame({"datetime": pd.date_range(start='1/1/2019 00:00:00', periods=20, freq='S')})
    data_min = pd.DataFrame({"datetime": pd.date_range(start='1/2/2019 00:00:00', periods=30, freq='H')})
    data_hou = pd.DataFrame({"datetime": pd.date_range(start='1/3/2019 00:00:00', periods=50, freq='D')})
    dt_val = pd.concat([data_sec, data_min, data_hou], axis=0, ignore_index=True)
    df = pd.merge(left=dt_val, right=df_val, left_index=True, right_index=True)
    return df
Calling scale(scaler, fit=False) multiple times should behave like calling it only once, since it takes effect only once when fit=True.
df = pd.DataFrame({"datetime": np.array(['1/1/2019', '1/2/2019']),
                   "value": np.array([1, 2])})
df_test = pd.DataFrame({"datetime": np.array(['1/3/2019', '1/4/2019']),
                        "value": np.array([1, 2])})
tsdata = TSDataset.from_pandas(df,
                               dt_col="datetime",
                               target_col="value")
tsdata_test = TSDataset.from_pandas(df_test,
                                    dt_col="datetime",
                                    target_col="value")
standard_scaler = StandardScaler()
tsdata.scale(standard_scaler, fit=True)
tsdata_test.scale(standard_scaler, fit=False).scale(standard_scaler, fit=False)
print(tsdata_test.df)
The expected output value column is [-1, 1]; currently it is [-5, -1].
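The numbers above can be reproduced by hand: fitting on [1, 2] gives mean 1.5 and standard deviation 0.5, so one transform yields [-1, 1] and a second transform of that result yields [-5, -1]. The sketch below shows this arithmetic along with one possible guard, a flag that makes repeated transforms a no-op. The ScaledOnce class is hypothetical and is not the TSDataset implementation.

```python
import numpy as np


class ScaledOnce:
    """Hypothetical wrapper that applies a standard-scaling transform at most once."""

    def __init__(self, values):
        self.values = np.asarray(values, dtype=float)
        self._scaled = False  # remembers whether a transform was already applied

    def scale(self, mean, std):
        if not self._scaled:
            self.values = (self.values - mean) / std
            self._scaled = True
        return self  # allow chaining, like tsdata.scale(...).scale(...)


train = np.array([1.0, 2.0])
mean, std = train.mean(), train.std()  # 1.5 and 0.5 (population std)

# Guarded: the second call is ignored.
guarded = ScaledOnce([1.0, 2.0]).scale(mean, std).scale(mean, std)
print(guarded.values)  # [-1.  1.]

# Unguarded: the transform is applied twice, matching the reported output.
unguarded = ((train - mean) / std - mean) / std
print(unguarded)  # [-5. -1.]
```

A single boolean flag is the simplest option; it assumes the same scaler is passed on every call, which a real fix would probably need to check.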
When testing random sequences of tsdata calls (use get_multip_df), there will be the following three types of errors.
numpy boolean subtract, the - operator, is not supported, use the bitwise_xor, the ^ operator, or the logical_xor function instead.
In utils/feature.py, function _is_weekend(): the line return (weekday >= 5).values should be changed to return (weekday >= 5).astype(int).values
gen_dt_feature
.
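To illustrate why the astype(int) change in _is_weekend() avoids the boolean subtract error, here is a small sketch assuming weekday is a pandas Series of weekday numbers (Monday = 0). The dates are chosen arbitrarily; 2019-01-04 is a Friday.

```python
import pandas as pd

# Fri, Sat, Sun, Mon
dates = pd.Series(pd.date_range("2019-01-04", periods=4, freq="D"))
weekday = dates.dt.weekday

as_bool = (weekday >= 5).values             # current behavior: boolean array
as_int = (weekday >= 5).astype(int).values  # proposed behavior: 0/1 integers

# Subtracting boolean arrays is rejected by NumPy.
try:
    as_bool[1:] - as_bool[:-1]
except TypeError as err:
    print("bool arrays:", err)

# The integer version supports ordinary arithmetic.
diff = as_int[1:] - as_int[:-1]
print(as_int, diff)
```

Any downstream step that subtracts feature values (differencing or scaling, for example) would hit the same TypeError on the boolean column, so returning integers keeps the weekend feature consistent with the other numeric features.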