zqiao11 opened this issue 5 months ago
I believe it uses the same train/val/test split as the LSF setting. However, it doesn't perform normalization based on train set statistics, which is used in the LSF setting, so there is a mismatch between the fine-tuning and evaluation. If you want to fine-tune in a multivariate fashion, then yes, process it as a multivariate dataset, and also remove the SampleDimension transformation.
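A minimal sketch of that normalization step (illustrative only: the helper name and the `(time, num_variates)` array layout are my assumptions, not uni2ts's API):

```python
import numpy as np

def normalize_by_train_stats(series: np.ndarray, train_len: int) -> np.ndarray:
    """Standardize an array of shape (time, num_variates) using mean/std
    computed on the train split only, matching the LSF evaluation protocol."""
    train = series[:train_len]
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    return (series - mean) / std
```

The key point is that the statistics come from the train split alone, so the val/test portions are transformed with the same parameters the model saw during fine-tuning.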
Thanks for your reply. Following your suggestions, I normalized the data for fine-tuning, built the data in 'wide_multivariate' and removed the SampleDimension transformation.
However, when I ran the experiment, an error occurred:
...
AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/eee/qzz/uni2ts/venv/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
data = fetcher.fetch(index)
File "/home/eee/qzz/uni2ts/venv/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 54, in fetch
return self.collate_fn(data)
File "/home/eee/qzz/uni2ts/src/uni2ts/data/loader.py", line 106, in __call__
assert all(
AssertionError: Sample length must be less than or equal to max_length (512)
I think this error is caused by using a dataset built in 'wide_multivariate' mode. How should I handle this issue? Do I need to modify this max_length, and how is its value calculated?
Hi @zqiao11, sorry for the late response. Have you managed to resolve this issue? If not, could you provide more details?
Hi. I haven't resolved this issue, but I have tracked down the cause. It can happen when a flattened, patchified sequence exceeds Moirai's max_length=512. I think this could be common when processing data built in wide_multivariate mode.
For example, ETTh1 has 7 variates. If I use context_length=5000, prediction_length=96, and patch_size=64 (the same config used to reproduce the LSF results), then there are 81 patches per variate. After flattening the 7 variates, there are 567 patches (equal to target.size(1)), exceeding the max_seq_length of 512.
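The arithmetic above can be checked with a short sketch (num_patches is a hypothetical helper, not a uni2ts function; I assume the context and prediction windows are each rounded up to whole patches before the variates are flattened, which reproduces the 567 figure):

```python
import math

def num_patches(context_len: int, prediction_len: int,
                patch_size: int, num_variates: int) -> int:
    # Each window is patchified separately, rounding up to whole patches,
    # then all variates are flattened into one token sequence.
    per_variate = (math.ceil(context_len / patch_size)
                   + math.ceil(prediction_len / patch_size))
    return per_variate * num_variates

# ETTh1 example: 7 variates, context 5000, horizon 96, patch size 64
total = num_patches(5000, 96, 64, 7)
print(total)  # 567, exceeding max_seq_len = 512
```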
The assertion error is raised by the sequence packing function, which is only used in training, not in forecasting. That is why one can evaluate the model with mode=M without error.
BTW, is it safe to modify this max_seq_len? Besides sequence packing, I notice it is also used in the code related to self-attention.
FYI, you can reproduce this issue by running your example fine-tuning code. Just build the ETTh1 dataset with wide_multivariate, and set context_length=5000, prediction_length=96, and patch_size=64.
@zqiao11 Have you resolved this issue? I'm experiencing the same situation as you.
@zqiao11 Hello, I also fine-tuned the model on ETTh1 and its performance decreased significantly. Have you solved this issue after normalizing ETTh1?
@wyhzunzun123123 Hi, I haven't solved this issue for ETTh1. Since the config for reproduction uses mode='M' for ETTh1, I think one may need to fine-tune it with the dataset built in the multivariate time series format. But I cannot handle the error caused by max_seq_len and need to wait for the author's reply.
You may consider fine-tuning the model with the ETTm1 dataset, which is evaluated in mode='S' (build the dataset in 'wide').
So sorry for the delayed response. For the max_seq_len issue, you can use one of the following options:

- increase the max_seq_len parameter
- set the max_dim parameter appropriately

The idea is that we set a maximum number of tokens, max_seq_len. This is calculated by (context_len + prediction_len) / patch_size * dim.
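For example, one can solve that formula for either parameter (a sketch; the helper names tokens_needed and max_dim_for are hypothetical, and I round patch counts up per window to match the 567 figure reported earlier in this thread):

```python
import math

def tokens_needed(context_len: int, prediction_len: int,
                  patch_size: int, dim: int) -> int:
    # Option 1: max_seq_len must be at least this many tokens.
    per_variate = (math.ceil(context_len / patch_size)
                   + math.ceil(prediction_len / patch_size))
    return per_variate * dim

def max_dim_for(context_len: int, prediction_len: int,
                patch_size: int, max_seq_len: int = 512) -> int:
    # Option 2: the largest number of variates that fits in max_seq_len.
    per_variate = (math.ceil(context_len / patch_size)
                   + math.ceil(prediction_len / patch_size))
    return max_seq_len // per_variate

print(tokens_needed(5000, 96, 64, 7))  # 567 -> need max_seq_len >= 567
print(max_dim_for(5000, 96, 64))       # 6  -> or keep max_dim <= 6
```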
Regarding a difference in performance with ETTh1, if you want to evaluate on the LSTF setting, you will have to perform a normalization on the train set statistics first.
Thanks. Can you briefly explain the role of the SampleDimension feature? Does it sample as many dimensions/variates as possible from an MTS given a limit of max_seq_len?
It subsamples the variates given the max_dim parameter. max_seq_len is not passed to SampleDimension as a parameter.
@zqiao11 Hi, have you solved the max_seq_len issue? Any experience with this error? Thank you so much!
Hi. I'm working on enhancing long sequence forecasting performance through finetuning. I have successfully replicated the zero-shot learning results shown in Table 22 and will use them as a baseline for comparison.
For a fair comparison, I need to do fine-tuning under the same train-val-test setup as the zero-shot experiments in Table 22. However, I am unsure if my approach is correct. Below is a summary of my workflow to fine-tune Moirai-small on ETTh1 and evaluate it with a prediction length of 96:

I set conf/finetune/val_data/etth1.yaml as:

Then I fine-tuned a Moirai model with the same command as in the example:
Finally, I changed the ckpt in the model's yaml and evaluated the finetuned model by the 2nd approach in the example:
Despite following these steps, the fine-tuning results underperform the zero-shot outcomes (MSE is 0.375 and MAE is 0.402 in the original results).
I have a few questions:
- Since data.mode = M is used during testing, do I need to build the dataset with wide_multivariate for fine-tuning?

Thank you for your assistance.