Question about training, validation, testing and missing rates?

WenjieDu / SAITS

The official PyTorch implementation of the paper "SAITS: Self-Attention-based Imputation for Time Series". A fast and state-of-the-art (SOTA) deep-learning neural network model for efficient time-series imputation (impute multivariate incomplete time series containing NaN missing data/values with machine learning). https://arxiv.org/abs/2202.08516

https://doi.org/10.1016/j.eswa.2023.119619

MIT License

292 stars, 48 forks

Question about training, validation, testing and missing rates? #37

Closed. Rajesh90123 closed this issue 1 month ago

Rajesh90123 commented 2 months ago

Greetings, sir. In your paper, "SAITS: Self-Attention-based Imputation for Time Series", you present a table of results for different missing rates (20% to 90%). Do you select a best model separately for each missing rate, using a validation set with that same rate, and report its result on the test set with the corresponding rate? Or do you save one model based on a single missing rate in the validation data (say, 20% missing) and use it to produce the test results for the other missing rates (say, 30% artificially missing in the test data)?

WenjieDu commented 2 months ago

Hi there,

Thank you so much for your attention to SAITS! If you find SAITS helpful to your work, please star⭐️ this repository. Your star is your recognition, and it helps others notice SAITS. It matters and is definitely a kind of contribution.

I have received your message and will respond ASAP. Thank you again for your patience! 😃

Best,
Wenjie

WenjieDu commented 1 month ago

You can figure this out by reading the code in our data preprocessing scripts, which are in the directory dataset_generating_scripts.
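For readers who want a concrete picture before opening those scripts, here is a minimal sketch of how a given artificial missing rate can be applied to a held-out set so that imputation error is measured only on the values that were hidden. It is not taken from the repository: the function names `mask_observed` and `masked_mae`, the toy array shape, and the idea of reusing one rate for both validation and test are assumptions made for illustration, so please check the actual scripts for what was done in the paper.

```python
import numpy as np


def mask_observed(X, rate, seed=0):
    """Hide a fraction `rate` of the observed (non-NaN) entries of X.

    Hypothetical helper, not part of SAITS. Returns (X_masked, indicating),
    where `indicating` is True exactly at entries that were observed in X
    but are set to NaN in X_masked.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    flat = X.copy().ravel()
    observed = np.flatnonzero(~np.isnan(flat))      # indices of real values
    n_hide = int(round(rate * observed.size))
    hidden = rng.choice(observed, size=n_hide, replace=False)
    flat[hidden] = np.nan                           # artificially remove them
    indicating = np.zeros(flat.size, dtype=bool)
    indicating[hidden] = True
    return flat.reshape(X.shape), indicating.reshape(X.shape)


def masked_mae(imputed, target, indicating):
    """Mean absolute error over the artificially hidden entries only."""
    return np.abs(imputed - target)[indicating].mean()


# Toy data: 10 samples, 6 time steps, 3 features, one naturally missing value.
rng = np.random.default_rng(1)
X_val = rng.normal(size=(10, 6, 3))
X_val[0, 0, 0] = np.nan

# Assumed setup: one validation set per missing rate.
for rate in (0.2, 0.5, 0.9):
    X_masked, indicating = mask_observed(X_val, rate, seed=3)
    # Stand-in for a model: fill every NaN with zero.
    imputed = np.where(np.isnan(X_masked), 0.0, X_masked)
    print(rate, int(indicating.sum()), masked_mae(imputed, X_val, indicating))
```

Under this assumed setup, the score on the validation set masked at a given rate would be the one used to pick the checkpoint for that rate, and the same masking function with a different seed would build the matching test set.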

Rajesh90123 commented 1 month ago

Thank you, sir, for your response. Do you also search for the best hyperparameters separately for each missing rate of the validation set, and then test with them on the test dataset at the corresponding missing rate?

WenjieDu commented 1 month ago

Yep, this is also clarified in our paper. I suggest you use PyPOTS (https://github.com/WenjieDu/PyPOTS), which could be quite useful to you if you work on incomplete time series.