PaddlePaddle / PaddleTS

Awesome Easy-to-Use Deep Time Series Modeling based on PaddlePaddle, including comprehensive functionality modules like TSDataset, Analysis, Transform, Models, AutoTS, and Ensemble, etc., supporting versatile tasks like time series forecasting, representation learning, and anomaly detection, etc., featured with quick tracking of SOTA deep models.
Apache License 2.0

Model predictions are unaffected by known covariates after training with them. #451

Closed suntao2015005848 closed 6 months ago

suntao2015005848 commented 12 months ago

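With a time series model trained using LSTNetRegressor, the known covariates have no effect on the model's predictions, even though the model was trained with them.

Version information:
PaddlePaddle: 2.3.2.post112
PaddleTS: 1.1.0

Model training code: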

import pandas as pd

import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler, MinMaxScaler
import datetime
import paddlets
from paddlets import TSDataset
from paddlets import TimeSeries
from paddlets.models.forecasting.dl import *  # import all deep learning forecasting models
from paddlets.models.forecasting import *  # import all forecasting models
from paddlets.transform import Fill, StandardScaler
from paddlets.metrics import MSE, MAE
import warnings
warnings.filterwarnings('ignore')
from paddlets.automl.autots import AutoTS
import os 
from paddlets.automl.autots import SearchSpaceConfiger
from ray.tune import uniform, qrandint, choice,quniform
from paddlets.transform import TimeFeatureGenerator

# Read the CSV file
df = pd.read_csv('/home/aistudio/ydl/fh_power_data_expend.csv')
df = df.filter(items=['monitorTime', 'presentValue'])
print(len(df))

target_cov_dataset = TSDataset.load_from_dataframe(
    df,
    time_col='monitorTime',
    target_cols='presentValue',
    freq='5min',
    fill_missing_dates=True,
    fillna_method='pre'
)    
# Add a known covariate indicating whether each timestamp falls on a workday
time_feature_generator = TimeFeatureGenerator(feature_cols=['is_workday'])
target_cov_dataset = time_feature_generator.fit_transform(target_cov_dataset)

train_dataset, val_test_dataset = target_cov_dataset.split(0.8)
val_dataset, test_dataset = val_test_dataset.split(0.5)
train_dataset.plot(add_data=[val_dataset,test_dataset], labels=['Val', 'Test'])

lstm = LSTNetRegressor(
    in_chunk_len = 288,
    out_chunk_len = 6,
    max_epochs=3
)
lstm.fit(train_dataset, val_dataset) 

lstm.save(path="/home/aistudio/ydl/mode_fh_expend/lsnet")

Test code for model prediction:

df = pd.read_csv('/home/aistudio/ydl/test.csv')
df = df.filter(items=['monitorTime', 'presentValue'])

target_cov_dataset = TSDataset.load_from_dataframe(
    df,
    time_col='monitorTime',
    target_cols='presentValue',
    freq='5min',
    fill_missing_dates=True,
    fillna_method='pre'
)    
# print('Without any covariates: -------------------------------------------------------->')
# print(target_cov_dataset.get_all_cov())

# Add a known covariate indicating whether each timestamp falls on a workday
time_feature_generator = TimeFeatureGenerator(feature_cols=['is_workday'])
target_cov_dataset = time_feature_generator.fit_transform(target_cov_dataset)
print('With the time covariate added: ---------------------------------------------------------->')
print(target_cov_dataset.get_all_cov())

mod = LSTNetRegressor.load(path="/home/aistudio/ydl/mode_fh_expend/lsnet")
da1, da2 = target_cov_dataset.split('2023-09-11 00:05:00')
print('Data used for prediction: ---------------------------------------------------------->')
print(da1)
res = mod.predict(da1)
print('Prediction results without covariates: ---------------------------------------------------------->')
print(res)

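LSTNetRegressor model parameters: 116e8f0fa652452133668215542321a

Prediction results with the is_workday known covariate: image

Prediction results without the is_workday known covariate: image

The two prediction results are identical with and without the covariate. Why does the covariate have no effect on the LSTNetRegressor predictions? By contrast, when I test NBEATSModel, changing the covariates does change the model output.

NBEATSModel model parameters: cf25f4f514b45771cbfce4e6e5ccba3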

Sunting78 commented 12 months ago

The LSTNetRegressor model does not support covariates.
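One way to check this kind of behaviour empirically for any model is to perturb the covariate while keeping the target fixed, then see whether the prediction changes. The sketch below is only an illustration under stated assumptions: `covariate_changes_output`, `ignores_covariate`, and `uses_covariate` are hypothetical names, and the two predictors are plain Python stand-ins rather than PaddleTS models. With PaddleTS, the same idea would mean calling `predict` on two datasets that differ only in their covariates and comparing the outputs.

```python
import math
import random
from typing import Callable, List, Sequence

# Hypothetical predictor signature: (target history, covariate history) -> forecast
Predictor = Callable[[Sequence[float], Sequence[float]], List[float]]


def covariate_changes_output(
    predict: Predictor,
    target: Sequence[float],
    covariate: Sequence[float],
    tol: float = 1e-9,
    seed: int = 0,
) -> bool:
    """Return True if perturbing the covariate changes the forecast."""
    rng = random.Random(seed)
    # Shift every covariate value by a random amount in [1, 2] so it clearly differs.
    perturbed = [c + rng.uniform(1.0, 2.0) for c in covariate]
    baseline = predict(target, covariate)
    altered = predict(target, perturbed)
    return any(
        not math.isclose(a, b, rel_tol=0.0, abs_tol=tol)
        for a, b in zip(baseline, altered)
    )


def ignores_covariate(target: Sequence[float], covariate: Sequence[float]) -> List[float]:
    # Stand-in for a model that only reads the target history.
    return [sum(target[-3:]) / 3.0] * 2


def uses_covariate(target: Sequence[float], covariate: Sequence[float]) -> List[float]:
    # Stand-in for a model whose forecast also depends on the latest covariate value.
    return [sum(target[-3:]) / 3.0 + 0.5 * covariate[-1]] * 2


target = [1.0, 2.0, 3.0, 4.0]
is_workday = [1.0, 1.0, 0.0, 0.0]

print(covariate_changes_output(ignores_covariate, target, is_workday))  # False
print(covariate_changes_output(uses_covariate, target, is_workday))     # True
```

A `False` result for a trained model suggests the covariate never reaches its network, which would match the identical predictions reported above.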

suntao2015005848 commented 12 months ago

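@Sunting78 Do the other models support covariates? Is there any documentation describing this for the built-in models? I did not find much about it in the PaddleTS API documentation.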

Sunting78 commented 6 months ago

Hello. The RNN and NBeats algorithms, for example, support covariates. For details, see the network structure code of each model.