WenjieDu / SAITS

The official PyTorch implementation of the paper "SAITS: Self-Attention-based Imputation for Time Series". A fast and state-of-the-art (SOTA) deep-learning neural network model for efficient time-series imputation (impute multivariate incomplete time series containing NaN missing data/values with machine learning). https://arxiv.org/abs/2202.08516
https://doi.org/10.1016/j.eswa.2023.119619
MIT License

What is the best setup of SAITS to improve imputation results within missing gap and single missing over time data? #44

Open clevilll opened 1 month ago

clevilll commented 1 month ago

Hi, I have been experimenting with this DL architecture to see how well it imputes univariate time-series data for:

using this setup:

    def __init__(self, columns=None,
                 n_steps=100,
                 n_features=1,   # univariate time-series data
                 n_layers=2,
                 d_model=256,
                 n_heads=4,
                 d_k=64,
                 d_v=64,
                 d_ffn=128,
                 dropout=0.1,
                 epochs=100):
...
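
For context, with n_steps=100 and n_features=1 a univariate series is usually cut into fixed-length windows before it reaches a model of this kind. Below is a minimal sketch in plain Python, assuming non-overlapping windows and a (n_samples, n_steps, n_features) layout. The helpers make_windows and missing_rate and the toy series are hypothetical and not part of SAITS.

```python
import math


def make_windows(values, n_steps=100):
    """Cut a 1-D sequence into non-overlapping windows of n_steps points.

    Returns a nested list shaped (n_samples, n_steps, 1). The trailing
    remainder that does not fill a whole window is dropped.
    """
    n_samples = len(values) // n_steps
    return [
        [[values[i * n_steps + t]] for t in range(n_steps)]
        for i in range(n_samples)
    ]


def missing_rate(window):
    """Fraction of NaN entries in one window."""
    flat = [x for step in window for x in step]
    return sum(math.isnan(x) for x in flat) / len(flat)


# Hypothetical series: 250 points with a 30-point gap starting at index 120.
series = [float(i % 12) for i in range(250)]
for i in range(120, 150):
    series[i] = math.nan

windows = make_windows(series, n_steps=100)
print(len(windows), len(windows[0]), len(windows[0][0]))
print([missing_rate(w) for w in windows])
```

At a 5-minute resolution, one 100-step window spans 500 minutes (about 8.3 hours), so a gap longer than that would leave a window with no observed values at all. Checking the per-window missing rate before training can show whether that is happening.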

I have reached the following results:

image

Let's zoom in and compare the performance of SAITS() with classic treatments of missing data: X['avgcpu'].interpolate(method=...), as well as replacing missing values with the median of all instances via X['avgcpu'].median().

Fig. 1: Comparison of imputations for missing gap
Fig. 2: Comparison of imputations for single missing sequence

I don't see much difference between the imputations of SAITS() and median(), especially over missing gaps. For single missing values, the results of the classic interpolation fillers (linear/nearest) are comparable with SAITS(). I expected that, at least in the missing-gap case, a DL-based model would impute more meaningful values.
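
One way to make this comparison quantitative is to hide values that are actually known, impute them, and score only the hidden positions. Below is a minimal sketch in plain Python, assuming MAE as the metric. The helper names and the toy series are hypothetical, and the two fillers only stand in for X['avgcpu'].interpolate(method='linear') and X['avgcpu'].median().

```python
import math
from statistics import median


def fill_median(values):
    """Replace every NaN with the median of the observed values."""
    med = median(v for v in values if not math.isnan(v))
    return [med if math.isnan(v) else v for v in values]


def fill_linear(values):
    """Linearly interpolate NaNs between the nearest observed neighbours.

    Leading or trailing NaNs take the nearest observed value.
    """
    obs = [i for i, v in enumerate(values) if not math.isnan(v)]
    if not obs:
        raise ValueError("no observed values to interpolate from")
    out = list(values)
    for i, v in enumerate(values):
        if not math.isnan(v):
            continue
        left = max((j for j in obs if j < i), default=None)
        right = min((j for j in obs if j > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            w = (i - left) / (right - left)
            out[i] = values[left] + w * (values[right] - values[left])
    return out


def masked_mae(truth, imputed, hidden):
    """Mean absolute error over the artificially hidden indices only."""
    return sum(abs(truth[i] - imputed[i]) for i in hidden) / len(hidden)


# Toy ground truth (a plain ramp), one contiguous gap and two single misses.
truth = [float(i) for i in range(20)]
gap = list(range(8, 13))
single = [3, 16]
hidden = set(gap + single)
observed = [math.nan if i in hidden else v for i, v in enumerate(truth)]

for name, filler in [("linear", fill_linear), ("median", fill_median)]:
    imputed = filler(observed)
    print(name,
          "gap MAE:", masked_mae(truth, imputed, gap),
          "single MAE:", masked_mae(truth, imputed, single))
```

The ramp trivially favours linear interpolation, so it only illustrates the scoring procedure. Scoring the gap and the single misses separately, on real data and with the SAITS() output added as a third candidate, would show where each method actually differs.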

I’d appreciate any insights, based on your experience, on whether I need to adjust the hyperparameters of SAITS() for further improvement. I also read the closed issues in this repo but did not find anything helpful for these missing-data scenarios.


Note: The resolution of the time data is 5 minutes (epoch=5mins); some models do not handle high-frequency time data well.

WenjieDu commented 1 month ago

Hi there,

Thank you so much for your attention to SAITS! If you find SAITS helpful to your work, please star⭐️ this repository. Your star is your recognition, and it helps others notice SAITS. It matters and is definitely a kind of contribution.

I have received your message and will respond ASAP. Thank you again for your patience! 😃

Best,
Wenjie

github-actions[bot] commented 2 weeks ago

This issue had no activity for 14 days. It will be closed in 1 week unless there is some new activity. Is this issue already resolved?

clevilll commented 2 weeks ago

> This issue had no activity for 14 days. It will be closed in 1 week unless there is some new activity. Is this issue already resolved?

The problem has not been solved, and @WenjieDu has not provided an answer so far.