gzerveas / mvts_transformer

Multivariate Time Series Transformer, public version

Some difficulties reproducing the results in the paper #36

Closed Guanyunlph closed 1 year ago

Guanyunlph commented 1 year ago

Thank you for your significant contributions to the field of time series, which have given me the opportunity to build on your work in downstream tasks. However, I am facing some challenges in reproducing the results of your paper. I used the unsupervised pre-training mode followed by fine-tuning for regression on the AppliancesEnergy data, with the specific hyperparameter values given in your paper (Table 14).

The specific pre-training command is as follows:

```
python src/main.py --output_dir experiments --comment "pretraining through imputation" --name pretrained --records_file Imputation_records.xls --data_dir "/AppliancesEnergy/" --data_class tsra --pattern TRAIN --val_ratio 0.2 --epochs 700 --lr 0.001 --optimizer RAdam --pos_encoding learnable --num_layers 3 --num_heads 16 --d_model 128 --dim_feedforward 512 --batch_size 128
```

The specific fine-tuning command is as follows:

```
python src/main.py --output_dir experiments --comment "finetune for regression" --name finetuned --records_file Regression_records.xls --data_dir /AppliancesEnergy/ --data_class tsra --pattern TRAIN --val_pattern TEST --epochs 200 --lr 0.001 --optimizer RAdam --pos_encoding learnable --load_model /pretrained_2023-02-17_21-13-58_dtF/checkpoints/model_best.pth --task regression --change_output --num_layers 3 --num_heads 16 --d_model 128 --dim_feedforward 512 --batch_size 128
```

The specific test command is as follows:

```
python src/main.py --output_dir experiments --comment "test" --name test --data_dir /AppliancesEnergy/ --data_class tsra --load_model /finetuned_2023-02-17_21-28-36_l5O/checkpoints/model_best.pth --pattern TEST --test_only testset --num_layers 3 --num_heads 16 --d_model 128 --dim_feedforward 512 --batch_size 128 --task regression
```

I have run these three steps (pre-training, fine-tuning, and testing) several times, but each time the test loss differs substantially from the value in the paper. I saw a similar situation in issue 19 (Test result in multivariate dataset without pretrain #19), where the reporter said he solved it by searching for the best epoch and then training the model for that many epochs on the whole training set. However, I did not fully understand what he meant; is it something like the sketch below? Would you be able to provide a detailed explanation or some other advice to a beginner?
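This is my rough understanding of that procedure (`make_model`, `train_one_epoch`, and `evaluate` are hypothetical placeholders, not functions from this repository):

```python
def best_epoch_then_retrain(train_set, val_set, max_epochs=700):
    # Phase 1: train on a reduced training set, tracking validation loss
    # to find the epoch at which the model generalizes best.
    model = make_model()                        # hypothetical
    best_epoch, best_val_loss = 0, float("inf")
    for epoch in range(1, max_epochs + 1):
        train_one_epoch(model, train_set)       # hypothetical
        val_loss = evaluate(model, val_set)     # hypothetical
        if val_loss < best_val_loss:
            best_epoch, best_val_loss = epoch, val_loss

    # Phase 2: retrain from scratch on the full training data
    # (train + validation) for exactly best_epoch epochs.
    model = make_model()
    for _ in range(best_epoch):
        train_one_epoch(model, train_set + val_set)
    return model
```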

The pre-trained RMSE result for AppliancesEnergy is 2.375 (Table 4), but I got these test results: Test Summary: loss: 11.078025 | Test Summary: loss: 11.406574 | Test Summary: loss: 10.409667

gzerveas commented 1 year ago

Okay, so the first thing to keep in mind is that the values reported by the code (as mentioned in the README) are always MSE, not RMSE, so you should take the square root at the end. This means that your last test loss corresponds to an RMSE of about sqrt(10.41) ≈ 3.226.
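For example, converting all three of the reported MSE losses to RMSE (a plain-Python check, using the values from your post):

```python
import math

mse_losses = [11.078025, 11.406574, 10.409667]  # test losses reported above
rmse_losses = [math.sqrt(m) for m in mse_losses]
print(rmse_losses)  # [3.328..., 3.377..., 3.226...]
```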

Secondly, and very importantly, you should allow fine-tuning to run for a sufficient number of epochs (e.g., 700 for this dataset, not 200; you can monitor how the loss evolves). On this dataset, I got the best results after more than 600 epochs of fine-tuning.

Finally, specifically for this dataset, the result for the pre-trained / fine-tuned transformer was achieved with a batch size of 64, not 128; the rest of the hyperparameters are correct. A couple of datasets, including this one, showed better performance with a batch size different from 128 (typically 64 or 32), and for this reason the batch size should probably have been listed as a separate hyperparameter - but unfortunately it never was.

If you take care of these 3 points, I believe you will get the performance you desire.
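Putting the second and third points together, your fine-tuning command would become something like this (same paths and flags as in your post; only --epochs and --batch_size change):

```
python src/main.py --output_dir experiments --comment "finetune for regression" --name finetuned --records_file Regression_records.xls --data_dir /AppliancesEnergy/ --data_class tsra --pattern TRAIN --val_pattern TEST --epochs 700 --lr 0.001 --optimizer RAdam --pos_encoding learnable --load_model /pretrained_2023-02-17_21-13-58_dtF/checkpoints/model_best.pth --task regression --change_output --num_layers 3 --num_heads 16 --d_model 128 --dim_feedforward 512 --batch_size 64
```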

Guanyunlph commented 1 year ago

Thank you very much for your patient guidance. Following your advice, I was able to reproduce the results of the paper perfectly. I was lucky to have encountered such a patient and friendly author and such a beautiful piece of work. Hyperparameters can be both loved and hated; do you have any tips for tuning them, or could you recommend some tuning tools? Thank you again.

gzerveas commented 1 year ago

Hi @Guanyunlph, thank you for your kind words. I am glad you could reproduce the results perfectly. I think your last question is a completely different topic, so could you please remove it from this thread and open a new issue? (It helps with discoverability, visibility, etc.) I will try to answer with a few thoughts there.

Guanyunlph commented 1 year ago

Well, I'd be happy to do that.