TensorPulse commented 2 months ago

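Hello, and thank you for providing such a complete learning framework! While using and porting the baselines I ran into a few problems and inconveniences, which I am listing here for your reference. Disclaimer: the problems and suggestions below are only my personal opinion.

Problem: running data_preparation directly from PyCharm fails because the dataset files cannot be found, and the same happens when running train. The following changes make it work:

    OUTPUT_DIR = "../../../experiments/datasets/" + DATASET_NAME
    DATA_FILE_PATH = "../../../datasets/raw_data/{0}/{0}.npz".format(DATASET_NAME)
    GRAPH_FILE_PATH = "../../../datasets/raw_data/{0}/adj_{0}".format(DATASET_NAME)
    DISTANCE_FILE_PATH = "../../../datasets/raw_data/{0}/distance_{0}".format(DATASET_NAME)

Suggestions:

1. Dataset normalization and de-normalization: CFG.RESCALE currently controls two things at once. If True, the data is de-normalized and the whole dataset is standardized together; if False, the data is not de-normalized and each channel is standardized separately. This could be split into two variables, one controlling how the data is standardized and one controlling whether it is de-normalized.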

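2. Training results are not labeled clearly: naming the checkpoint directory by model name plus number of epochs leaves out information. It could be changed to:

    CFG.TRAIN.CKPT_SAVE_DIR = os.path.join(
        "checkpoints",
        CFG.MODEL.NAME,
        "_".join([CFG.DATASET_NAME, str(CFG.TRAIN.NUM_EPOCHS)])
    )

3. The project lacks visualization interfaces: some metrics could be added to TensorBoard, or an interface for saving predictions could be added.

4. The project's cfg files have many hidden options. A Simple_CFG could be added that lists all of them, for example:

    CFG.MODEL.SETUP_GRAPH = False
    CFG.TRAIN.FINETUNE_FROM
    CFG.RESCALE = True

5. The STGODE baseline needs two tensors, A_sp_hat and A_se_hat. Even after the whole model is moved to the GPU, these two tensors stay on the CPU until a later .to(x.device) call. This is inconvenient when porting the model, because you have to track down where the tensors are finally used. I suggest using

    from easytorch.device import get_device_type

    if get_device_type() == 'gpu':
        device = 'cuda'
    else:
        device = 'cpu'
    self.device = device

or from easytorch.device import to_device.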
6. There is no progress bar during testing. It could be changed to:

    # tqdm progress bar
    data_iter = tqdm(self.test_data_loader)

    # test loop
    for iter_index, data in enumerate(data_iter):
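
On the working-directory problem described at the top of this comment, an alternative to editing the relative paths by hand is to resolve them from the script's own location, so the result no longer depends on where PyCharm (or a shell) starts the process. A minimal sketch, assuming the script sits three directories below the project root as the ../../../ prefixes suggest; the helper name resolve_from_script and the example path are hypothetical and not part of BasicTS:

```python
from pathlib import Path


def resolve_from_script(script_file: str, levels_up: int, *parts: str) -> Path:
    """Build a path relative to the directory that contains script_file.

    levels_up is the number of "../" steps to take from that directory.
    """
    base = Path(script_file).resolve().parent
    for _ in range(levels_up):
        base = base.parent
    return base.joinpath(*parts)


# Hypothetical script location; inside a real script, pass __file__ instead.
example = resolve_from_script(
    "/repo/scripts/data_preparation/PEMS08/generate_training_data.py",
    3, "datasets", "raw_data", "PEMS08", "PEMS08.npz",
)
print(example)
```

Because the base directory comes from the script file rather than the current working directory, the same script should behave identically whether it is launched from the IDE or from the project root.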

zezhishao commented 2 months ago

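Thank you very much for your suggestions; they are a great help to us! We have long planned an overall upgrade of the code but have struggled to find the time. Thanks again. We will work through these changes one by one. If you have already completed some of the changes, you are welcome to submit them as a PR to the main repository and become a BasicTS developer~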

zezhishao commented 2 months ago

"Add some metrics to TensorBoard": what exactly do you have in mind? Could you give some concrete requirements?

TensorPulse commented 2 months ago

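For example: the model's computation graph and architecture diagram; histograms of weights and biases over time; visualizations of predicted, ground-truth, and historical values; visualizations that project embeddings into a low-dimensional space; and so on.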

TensorPulse commented 2 months ago

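I personally think the standardization of the ETT datasets in the BasicTS paper is slightly problematic. Unlike the PEMS datasets, the columns of the ETT datasets do not share a unit, so their values can differ widely. Standardizing each channel separately may be more appropriate than standardizing the data as a whole. If de-normalization is needed when computing metrics, see the first suggestion above.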
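
To make the difference concrete, here is a small NumPy sketch comparing overall and per-channel z-score standardization, together with the matching inverse transform. The array layout [time, channel] and the toy values are assumptions for illustration only; this is not the BasicTS scaler.

```python
import numpy as np

# Toy data, shape [time, channel]: two columns on very different scales.
data = np.array([[1.0, 1000.0],
                 [2.0, 2000.0],
                 [3.0, 3000.0],
                 [4.0, 4000.0]])

# Overall standardization: one mean/std shared by every channel.
g_mean, g_std = data.mean(), data.std()
overall = (data - g_mean) / g_std

# Per-channel standardization: one mean/std per column.
c_mean = data.mean(axis=0, keepdims=True)  # shape [1, channel]
c_std = data.std(axis=0, keepdims=True)
per_channel = (data - c_mean) / c_std

# De-normalization is the exact inverse of whichever transform was used.
restored = per_channel * c_std + c_mean

print(overall.std(axis=0))      # the small-scale column is squashed towards zero
print(per_channel.std(axis=0))  # every column has unit standard deviation
```

With a shared mean and standard deviation, the small-scale column carries almost no variance after scaling, which is the concern raised here for datasets whose columns use different units.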

zezhishao commented 2 months ago

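OK, thanks for the suggestion. I am developing a new version of BasicTS; it is basically finished and will be released in the next few days. Users will then be able to specify the normalization method and whether to de-normalize at training time, without preprocessing the data in advance.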

TensorPulse commented 2 months ago

Hello, can BasicTS support automatic hyperparameter tuning? Reference links: https://zhuanlan.zhihu.com/p/401190615?utm_id=0 https://github.com/LibCity/Bigscity-LibCity

zezhishao commented 2 months ago

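Hello, the next version does not include automatic hyperparameter tuning. I am not very familiar with it, and in my own experience these datasets do not seem to be that sensitive to hyperparameters?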

huiguhean commented 2 months ago

A question: running train fails with FileNotFoundError: [Errno 2] No such file or directory: 'datasets/ETTh1/scaler_in_96_out_336_rescale_True.pkl'. Which folder should I modify?

huiguhean commented 2 months ago

It works after modifying the dataset settings of the corresponding model under baselines.

zezhishao commented 2 months ago

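Hello, in the current version you still need to generate the dataset manually for each combination of input and output length. You can do so with the following command: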

python scripts/data_preparation/${DATASET_NAME}/generate_training_data.py --history_seq_len ${INPUT_LEN} --future_seq_len ${OUTPUT_LEN}

For example:

python scripts/data_preparation/ETTh1/generate_training_data.py  --history_seq_len 96 --future_seq_len 336

A version that lets you specify these at training time will be released very soon, so stay tuned~
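
If several input/output lengths are needed, the same command can be generated in a loop. A small sketch that only builds the command lines from the two flags shown in this comment; the helper name build_commands and the list of length pairs are arbitrary examples, and the commands are printed rather than executed:

```python
import shlex


def build_commands(dataset_name, length_pairs):
    """Return one argument list per (input_len, output_len) pair."""
    script = f"scripts/data_preparation/{dataset_name}/generate_training_data.py"
    return [
        ["python", script,
         "--history_seq_len", str(input_len),
         "--future_seq_len", str(output_len)]
        for input_len, output_len in length_pairs
    ]


commands = build_commands("ETTh1", [(96, 336), (336, 336)])
for cmd in commands:
    # To actually run a command: subprocess.run(cmd, check=True)
    print(shlex.join(cmd))
```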

huiguhean commented 2 months ago

Awesome, enough said!

TensorPulse commented 2 months ago

Hello, will the next version have an interface for univariate forecasting?

zezhishao commented 2 months ago

What do you mean by univariate? Do you mean datasets with only a single time series?

TensorPulse commented 2 months ago

Sorry, it seems I only need to redefine the runner. By univariate forecasting I mean datasets with an OT variable, such as ETT.

TensorPulse commented 2 months ago

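Hello, I would like to ask about the history length and prediction length used for the long-term forecasting results in the BasicTS paper. The code seems to use different history lengths for different models, while the prediction length should be 336 for all of them, so I would like to confirm.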

zezhishao commented 2 months ago

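Yes. In fact, different papers specify different history lengths, and the optimal history length also differs between methods. Our approach is to use whichever of several commonly used history lengths performs best.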

morestart commented 2 months ago

I noticed that easytorch is no longer maintained. Are you considering switching the backend in the new version?

zezhishao commented 2 months ago

Currently, EasyTorch is still able to meet the needs of BasicTS, so there won't be any changes in the short term. However, in the longer term, I hope that the backend of BasicTS will no longer need to rely on other packages, although this will be time-consuming.

Do you have any other needs that the current EasyTorch backend cannot satisfy?

zezhishao commented 2 months ago

Hello, everyone! The BasicTS code has been updated. Feel free to check it out and use it!

TensorPulse commented 2 months ago

Hello, is the graph structure of the PEMS series datasets directed? The code treats it as an undirected graph by default. Could you provide an option to choose between a directed and an undirected graph? Or is the dataset itself undirected? Reference code:

    i, j, distance = int(row[0]), int(row[1]), float(row[2])
    adjacency_matrix_connectivity[id_dict[i], id_dict[j]] = 1
    adjacency_matrix_distance[id_dict[i], id_dict[j]] = distance
    if not directed:
        adjacency_matrix_connectivity[id_dict[j], id_dict[i]] = 1
        adjacency_matrix_distance[id_dict[j], id_dict[i]] = distance
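
For reference, a self-contained sketch of the same logic with the directed flag exposed as a parameter. The function name build_adjacency, the in-memory edge list, and the toy sensor IDs are assumptions for illustration; whether the PEMS distance files should be read as directed is exactly the open question here.

```python
import numpy as np


def build_adjacency(edges, node_ids, directed=False):
    """edges: iterable of (from_id, to_id, distance); node_ids: ordered sensor IDs."""
    id_dict = {node_id: idx for idx, node_id in enumerate(node_ids)}
    n = len(node_ids)
    connectivity = np.zeros((n, n), dtype=np.float32)
    distance = np.zeros((n, n), dtype=np.float32)
    for i, j, dist in edges:
        connectivity[id_dict[i], id_dict[j]] = 1
        distance[id_dict[i], id_dict[j]] = dist
        if not directed:
            # Mirror the edge so that both matrices are symmetric.
            connectivity[id_dict[j], id_dict[i]] = 1
            distance[id_dict[j], id_dict[i]] = dist
    return connectivity, distance


edges = [(10, 20, 1.5), (20, 30, 2.0)]  # toy edge list
undirected_conn, _ = build_adjacency(edges, [10, 20, 30])
directed_conn, _ = build_adjacency(edges, [10, 20, 30], directed=True)
```

In the undirected case every edge appears twice, so the connectivity matrix equals its transpose; in the directed case only the (from, to) entry is set.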

zezhishao commented 2 months ago

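Hello, the processing script for the PEMS0X datasets ships with the datasets. If you search GitHub for A[id_dict[i], id_dict[j]] = 1, you will find the implementation in many repositories, such as STGCN, ASTGCN, STSGCN, and STFGCN.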

TensorPulse commented 1 month ago

Hello, could the data visualization code be updated?

zezhishao commented 1 month ago

Sure, I forgot to update it. I will update it tomorrow.

TensorPulse commented 1 month ago

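The overlap parameter in CFG.DATASET.PARAM is not given in the complete_config file. Also, is the overlapping dataset split reasonable? Taking the PEMS08 dataset as an example, with equal validation and test ratios, the validation set ends up with 11 more samples than the test set.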

zezhishao commented 1 month ago

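Hello, thanks for the suggestion. The overlap parameter now defaults to False and is adjusted automatically, with a warning. When the raw data behind the train/valid/test split is too short to form enough samples (for example, the Illness dataset), overlap is enabled automatically and a message is logged. For example, running STID on the Illness dataset produces the following log: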

2024-09-13 10:27:07,383 - easytorch-launcher - INFO - Launching EasyTorch training.
DESCRIPTION: An Example Config
GPU_NUM: 1
RUNNER: <class 'basicts.runners.runner_zoo.simple_tsf_runner.SimpleTimeSeriesForecastingRunner'>
DATASET:
  NAME: Illness
  TYPE: <class 'basicts.data.simple_tsf_dataset.TimeSeriesForecastingDataset'>
  PARAM:
    dataset_name: Illness
    train_val_test_ratio: [0.7, 0.1, 0.2]
    input_len: 96
    output_len: 48
SCALER:
  TYPE: <class 'basicts.scaler.z_score_scaler.ZScoreScaler'>
  PARAM:
    dataset_name: Illness
    train_ratio: 0.7
    norm_each_channel: True
    rescale: False
MODEL:
  NAME: STID
  ARCH: <class 'baselines.STID.arch.stid_arch.STID'>
  PARAM:
    num_nodes: 7
    input_len: 96
    input_dim: 1
    embed_dim: 2048
    output_len: 48
    num_layer: 1
    if_node: True
    node_dim: 32
    if_T_i_D: True
    if_D_i_W: True
    temp_dim_tid: 8
    temp_dim_diw: 8
    time_of_day_size: 1
    day_of_week_size: 7
  FORWARD_FEATURES: [0, 1, 2]
  TARGET_FEATURES: [0]
METRICS:
  FUNCS:
    MAE: masked_mae
    MSE: masked_mse
  TARGET: MAE
  NULL_VAL: nan
TRAIN:
  NUM_EPOCHS: 100
  CKPT_SAVE_DIR: checkpoints/STID/Illness_100_96_48
  LOSS: masked_mae
  OPTIM:
    TYPE: Adam
    PARAM:
      lr: 0.0005
      weight_decay: 0.0005
  LR_SCHEDULER:
    TYPE: MultiStepLR
    PARAM:
      milestones: [1, 3, 5]
      gamma: 0.1
  CLIP_GRAD_PARAM:
    max_norm: 5.0
  DATA:
    BATCH_SIZE: 64
    SHUFFLE: True
VAL:
  INTERVAL: 1
  DATA:
    BATCH_SIZE: 64
TEST:
  INTERVAL: 1
  DATA:
    BATCH_SIZE: 64
EVAL:
  USE_GPU: True

2024-09-13 10:27:07,451 - easytorch-env - INFO - Use devices 0.
2024-09-13 10:27:07,506 - easytorch-launcher - INFO - Initializing runner "<class 'basicts.runners.runner_zoo.simple_tsf_runner.SimpleTimeSeriesForecastingRunner'>"
2024-09-13 10:27:07,506 - easytorch-env - INFO - Disable TF32 mode
2024-09-13 10:27:07,506 - easytorch - INFO - Set ckpt save dir: 'checkpoints/STID/Illness_100_96_48/9cd15181d2d202a278536bfd1f1031a0'
2024-09-13 10:27:07,506 - easytorch - INFO - Building model.
2024-09-13 10:27:07,747 - easytorch-training - INFO - Initializing training.
2024-09-13 10:27:07,747 - easytorch-training - INFO - Set clip grad, param: {'max_norm': 5.0}
2024-09-13 10:27:07,748 - easytorch-training - INFO - Building training data loader.
2024-09-13 10:27:07,748 - easytorch-training - INFO - Train dataset length: 534
2024-09-13 10:27:08,271 - easytorch-training - INFO - Set optim: Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0005
    maximize: False
    weight_decay: 0.0005
)
2024-09-13 10:27:08,271 - easytorch-training - INFO - Set lr_scheduler: <torch.optim.lr_scheduler.MultiStepLR object at 0x7faaa1792550>
2024-09-13 10:27:08,271 - easytorch-training - INFO - Loading Checkpoint from 'checkpoints/STID/Illness_100_96_48/9cd15181d2d202a278536bfd1f1031a0/STID_100.pt'
2024-09-13 10:27:08,343 - easytorch-training - INFO - Resume training
2024-09-13 10:27:08,344 - easytorch-training - INFO - Initializing validation.
2024-09-13 10:27:08,345 - easytorch-training - INFO - Building val data loader.
2024-09-13 10:27:08,345 - easytorch-training - INFO - Validation dataset is too short, enabling overlap. See details in /home/S22/workspace/BasicTS/basicts/data/simple_tsf_dataset.py at line 96.
2024-09-13 10:27:08,345 - easytorch-training - INFO - Validation dataset length: 96
2024-09-13 10:27:08,383 - easytorch-training - INFO - Test dataset length: 50
2024-09-13 10:27:08,384 - easytorch-training - INFO - Number of parameters: 9090224
2024-09-13 10:27:08,384 - easytorch-training - INFO - The training finished at 2024-09-13 10:27:08
2024-09-13 10:27:08,384 - easytorch-training - INFO - Evaluating the best model on the test set.
2024-09-13 10:27:08,384 - easytorch-training - INFO - Loading Checkpoint from 'checkpoints/STID/Illness_100_96_48/9cd15181d2d202a278536bfd1f1031a0/STID_best_val_MAE.pt'
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  7.40it/s]
2024-09-13 10:27:08,657 - easytorch-training - INFO - Result <test>: [test_time: 0.21 (s), test_MAE: 1.3040, test_MSE: 3.2762]
2024-09-13 10:27:08,658 - easytorch-training - INFO - Test results saved to checkpoints/STID/Illness_100_96_48/9cd15181d2d202a278536bfd1f1031a0/test_results.npz.
2024-09-13 10:27:08,659 - easytorch-training - INFO - Test metrics saved to checkpoints/STID/Illness_100_96_48/9cd15181d2d202a278536bfd1f1031a0/test_metrics.json.

The line "Validation dataset is too short, enabling overlap. See details in /home/S22/workspace/BasicTS/basicts/data/simple_tsf_dataset.py at line 96." means that the validation split is too short at this point, so overlap is set to True for the validation dataset.
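
The sample counts in the log can be reproduced with a little arithmetic. A sliding window of input_len + output_len steps over a split of n raw steps yields n - input_len - output_len + 1 samples, which is not positive when the split is shorter than one window. The sketch below is an illustration under stated assumptions, not the BasicTS implementation: the total length of 966 steps for Illness and the overlap rule (borrow input_len - 1 steps on the left and output_len steps on the right) are assumptions, chosen because they reproduce the 534 / 96 / 50 dataset lengths in the log.

```python
def num_samples(raw_len, input_len, output_len):
    """Number of sliding windows that fit in raw_len time steps."""
    return max(raw_len - input_len - output_len + 1, 0)


total_len = 966                 # ASSUMED length of the Illness series
ratios = (0.7, 0.1, 0.2)        # train_val_test_ratio from the log
input_len, output_len = 96, 48  # input_len / output_len from the log

valid_len = int(total_len * ratios[1])
test_len = int(total_len * ratios[2])
train_len = total_len - valid_len - test_len

train_samples = num_samples(train_len, input_len, output_len)
test_samples = num_samples(test_len, input_len, output_len)
valid_no_overlap = num_samples(valid_len, input_len, output_len)

# ASSUMED overlap rule: extend the validation split into its neighbours.
valid_overlap = num_samples(
    valid_len + (input_len - 1) + output_len, input_len, output_len
)

print(train_samples, valid_no_overlap, valid_overlap, test_samples)
```

Under these assumptions the validation split alone is shorter than a single window, so it yields no samples at all without overlap, which is the situation the log message reports.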

zezhishao commented 1 month ago

> Hello, could the data visualization code be updated?

Updated.

TensorPulse commented 1 month ago

Hello, is there a problem with the subset forecasting you just added? If de-normalization is enabled, inverse_transform cannot index by the selected self.select_target_time_series.

zezhishao commented 1 month ago

Thanks for the report! It should work now; please give it a try.

zezhishao commented 1 month ago

@all-contributors please add @TensorPulse for bug

allcontributors[bot] commented 1 month ago

@zezhishao

I've put up a pull request to add @TensorPulse! :tada:

TensorPulse commented 1 month ago

Hello, regarding subset forecasting: suppose the model input is B_T_N1_C and the output is B_T_N2_C. BasicTS does not seem to support this.

zezhishao commented 1 month ago

Could you provide a more detailed description, such as the specific functionality you want and how it differs from the current version?

TensorPulse commented 1 month ago

Taking PEMS08 as an example, suppose my model's input is [64, 12, 170, 1] and its output is [64, 12, 1, 1]. In the current version, simple_tsf_runner first raises an error here:

    assert list(model_return['prediction'].shape)[:3] == [batch_size, length, num_nodes], \
        "The shape of the output is incorrect. Ensure it matches [B, L, N, C]."

Second, when de-normalizing, inverse_transform cannot index by the selected self.select_target_time_series.

zezhishao commented 1 month ago

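Hello, subset forecasting is currently implemented mainly through SimpleTimeSeriesForecastingRunner.select_target_time_series, which requires the model's input and output to share the same N; the target series are selected with CFG.MODEL.TARGET_TIME_SERIES. This feature was added mainly to make multivariate-to-univariate forecasting quick to set up.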

zezhishao commented 1 month ago

If you want the N of the model's output to differ from that of its input, you can do so by writing a custom Datasets class (and leaving CFG.MODEL.TARGET_TIME_SERIES unset).

TensorPulse commented 1 month ago

This can be done with a self.if_out_target_nodes flag, which defaults to False. In SimpleTimeSeriesForecastingRunner.forward, the following check

    if self.target_time_series is not None:
        if list(model_return['prediction'].shape)[2] == len(self.target_time_series):
            self.if_out_target_nodes = True
            assert list(model_return['prediction'].shape)[:3] == [batch_size, length, len(self.target_time_series)], \
                "The shape of the output is incorrect. Ensure it matches [B, L, N, C]."
        else:
            assert list(model_return['prediction'].shape)[:3] == [batch_size, length, num_nodes], \
                "The shape of the output is incorrect. Ensure it matches [B, L, N, C]."
    else:
        assert list(model_return['prediction'].shape)[:3] == [batch_size, length, num_nodes], \
            "The shape of the output is incorrect. Ensure it matches [B, L, N, C]."

automatically compares the N of the model's input and output; if they differ, self.if_out_target_nodes is set to True. Then, in BaseTimeSeriesForecastingRunner.postprocessing:

    # rescale data

    if self.scaler is not None and self.scaler.rescale:
        input_data['target'] = self.scaler.inverse_transform(input_data['target'])
        input_data['inputs'] = self.scaler.inverse_transform(input_data['inputs'])
        if self.if_out_target_nodes:
            input_data['prediction'] = self.scaler.inverse_transform(input_data['prediction'], target_time_series=self.target_time_series)
        else:
            input_data['prediction'] = self.scaler.inverse_transform(input_data['prediction'])

    # subset forecasting
    if self.target_time_series is not None:
        input_data['target'] = input_data['target'][:, :, self.target_time_series, :]
        if not self.if_out_target_nodes:
            input_data['prediction'] = input_data['prediction'][:, :, self.target_time_series, :]

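Here self.scaler.inverse_transform takes an additional index argument, target_time_series. This way BasicTS can automatically tell whether the N of the model's output matches the N of its input in subset forecasting.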

TensorPulse commented 1 month ago

The concrete implementation of self.scaler.inverse_transform is as follows:

    def inverse_transform(self, input_data: torch.Tensor, target_time_series: List = None) -> torch.Tensor:
        mean = self.mean.to(input_data.device)
        std = self.std.to(input_data.device)
        if target_time_series is not None:
            mean = mean[:, target_time_series]
            std = std[:, target_time_series]
        # Clone the input data to prevent in-place modification (which is not allowed in PyTorch)
        input_data = input_data.clone()
        input_data[..., self.target_channel] = input_data[..., self.target_channel] * std + mean
        return input_data

zezhishao commented 1 month ago

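Thanks for the proposal, but I still do not understand why this is necessary. Please post the complete code using markdown syntax; the current formatting is completely garbled.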

TensorPulse commented 1 month ago

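The purpose is to accommodate univariate forecasting models. Taking the ETTh1 dataset as an example, when the target variable is OT and the others are auxiliary variables, some univariate models output [B, T, 1, 1]. This change makes it possible to port such models.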

SimpleTimeSeriesForecastingRunner.forward

# Ensure the output shape is correct
if self.target_time_series is not None:
    if list(model_return['prediction'].shape)[2] == len(self.target_time_series):
        self.if_out_target_nodes = True
        assert list(model_return['prediction'].shape)[:3] == [batch_size, length, len(self.target_time_series)], \
            "The shape of the output is incorrect. Ensure it matches [B, L, N, C]."
    else:
        assert list(model_return['prediction'].shape)[:3] == [batch_size, length, num_nodes], \
            "The shape of the output is incorrect. Ensure it matches [B, L, N, C]."
else:
    assert list(model_return['prediction'].shape)[:3] == [batch_size, length, num_nodes], \
        "The shape of the output is incorrect. Ensure it matches [B, L, N, C]."

BaseTimeSeriesForecastingRunner.postprocessing (add self.if_out_target_nodes = False in the initializer)

        # rescale data
        if self.scaler is not None and self.scaler.rescale:
            input_data['target'] = self.scaler.inverse_transform(input_data['target'])
            input_data['inputs'] = self.scaler.inverse_transform(input_data['inputs'])
            if self.if_out_target_nodes:
                input_data['prediction'] = self.scaler.inverse_transform(input_data['prediction'], target_time_series=self.target_time_series)
            else:
                input_data['prediction'] = self.scaler.inverse_transform(input_data['prediction'])

        # subset forecasting
        if self.target_time_series is not None:
            input_data['target'] = input_data['target'][:, :, self.target_time_series, :]
            if not self.if_out_target_nodes:
                input_data['prediction'] = input_data['prediction'][:, :, self.target_time_series, :]

        # self.scaler.inverse_transform
        def inverse_transform(self, input_data: torch.Tensor, target_time_series: List = None) -> torch.Tensor:
            mean = self.mean.to(input_data.device)
            std = self.std.to(input_data.device)
            if target_time_series is not None:
                mean = mean[:, target_time_series]
                std = std[:, target_time_series]
            input_data = input_data.clone()
            input_data[..., self.target_channel] = input_data[..., self.target_channel] * std + mean
            return input_data
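
To check the broadcasting in this proposal, here is a runnable sketch of the same idea. It uses NumPy in place of PyTorch and a stand-in class, ToyZScoreScaler, so the class name, the constructor, and the assumed statistics shape [1, N] are illustrative assumptions rather than the BasicTS scaler:

```python
from typing import List, Optional

import numpy as np


class ToyZScoreScaler:
    """Hypothetical stand-in for a per-series z-score scaler."""

    def __init__(self, mean: np.ndarray, std: np.ndarray, target_channel: int = 0):
        self.mean = mean  # assumed shape [1, N]: one value per time series
        self.std = std
        self.target_channel = target_channel

    def inverse_transform(self, input_data: np.ndarray,
                          target_time_series: Optional[List[int]] = None) -> np.ndarray:
        mean, std = self.mean, self.std
        if target_time_series is not None:
            # Keep only the statistics of the selected series.
            mean = mean[:, target_time_series]
            std = std[:, target_time_series]
        output = input_data.copy()  # leave the caller's array untouched
        output[..., self.target_channel] = output[..., self.target_channel] * std + mean
        return output


scaler = ToyZScoreScaler(mean=np.arange(7.0).reshape(1, 7), std=np.full((1, 7), 2.0))
full = scaler.inverse_transform(np.ones((4, 12, 7, 1)))  # [B, L, N, C]
subset = scaler.inverse_transform(np.ones((4, 12, 1, 1)), target_time_series=[6])  # [B, L, 1, C]
```

In this toy setup, de-normalizing a single-series output with the indexed statistics gives the same values as de-normalizing all seven series and then picking series 6.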

zezhishao commented 1 month ago

This is already supported. Setting CFG.MODEL.TARGET_TIME_SERIES=[6] (using ETT as an example) should be enough. In what way does the current implementation not fit your use case?

TensorPulse commented 1 month ago

Yes. My use case requires the model's output feature size to be 1.

zezhishao commented 1 month ago

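Hello, I have thought this over carefully. Thank you very much for the proposal, but your requirement is not well suited to a direct change of the existing architecture (two implementations of the same feature would cause confusion).

However, you can meet your requirement quickly in a non-intrusive way:

1. Define a new Runner: inherit from SimpleTimeSeriesForecastingRunner and override the postprocessing function.
2. Define a new Scaler: you can inherit from an existing Scaler and override the inverse_transform function.
3. Define a new Datasets class (if the samples to be predicted and the historical samples differ in N or in meaning).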

TensorPulse commented 1 month ago

Indeed, such a special-case adaptation could cause confusion. Thank you for laying out a complete non-intrusive approach.

zezhishao commented 1 month ago

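I think you could also stay compatible with the existing framework by adding a copy operation at the model's output.

For example, if your model's prediction has shape [B, L], you can use prediction = prediction.unsqueeze(-1).unsqueeze(-1).repeat(1, 1, self.num_nodes, 1) to obtain an output that fits the current architecture. Then set CFG.MODEL.TARGET_TIME_SERIES in the config to the correct index, and it should work.

De-normalization will still be computed on the series outside TARGET_TIME_SERIES, but only the target time series are extracted when computing the metrics (and the loss), so the final result is unaffected; it just does some redundant computation.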
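
The shape bookkeeping behind this workaround can be checked in a few lines. The sketch uses NumPy, where np.tile plays the role of Tensor.repeat and indexing with None plays the role of unsqueeze; the sizes and the target index 6 are example values:

```python
import numpy as np

batch_size, length, num_nodes = 4, 12, 7
target_time_series = [6]

prediction = np.random.rand(batch_size, length)  # model output, [B, L]

# NumPy equivalent of .unsqueeze(-1).unsqueeze(-1).repeat(1, 1, num_nodes, 1)
expanded = np.tile(prediction[..., None, None], (1, 1, num_nodes, 1))  # [B, L, N, 1]

# Subset selection, as applied before the metrics are computed
selected = expanded[:, :, target_time_series, :]  # [B, L, 1, 1]

print(expanded.shape, selected.shape)
```

Every copy along the N axis holds the same values, so whichever index is selected, the metric sees exactly the model's original [B, L] prediction.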

GestaltCogTeam / BasicTS

Some problems and optimization suggestions #139
