thuml / Autoformer

Code release for "Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting" (NeurIPS 2021), https://arxiv.org/abs/2106.13008
MIT License

# Question of training shuffle #118

Closed: accuracy-maker closed this issue 1 year ago

accuracy-maker commented 1 year ago

Thank you for sharing your great work. I notice that the training shuffle parameter is set to True. I think this will break the order of the time series, and I don't understand why it is used.

ChuckTG commented 1 year ago

Shuffling is applied to the order of the sequences (what I call batches here), not to the timesteps inside each one. For example, say the model input has to be of size [n_batches, timesteps] (no feature dimension, for simplicity), and we have two batches with 5 timesteps each: [[1,2,3,4,5],[0,-1,-2,-3,-4]] (n_batches=2, timesteps=5). Shuffling changes the order in which the model receives the batches, so we can end up with [[0,-1,-2,-3,-4],[1,2,3,4,5]]. It does not shuffle the timesteps themselves into something like [[2,5,3,1,4],[-3,0,-2,-1,-4]], because that would destroy the temporal order and hurt the model. Shuffling the batch order this way can help reduce overfitting.
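The distinction can be sketched in plain Python. This is only an illustration, not the repository's data loader: the helper names `make_windows` and `shuffled_windows`, the toy series, and the window length are all hypothetical. It slices a series into contiguous windows, then shuffles the order of the windows while leaving the values inside each window untouched.

```python
import random


def make_windows(series, window):
    """Slice a 1-D series into contiguous, overlapping windows (one per sample)."""
    return [series[i:i + window] for i in range(len(series) - window + 1)]


def shuffled_windows(series, window, seed=0):
    """Return the windows in random order; values inside each window keep their order."""
    windows = make_windows(series, window)
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    rng.shuffle(windows)       # reorders the samples, never the timesteps within one
    return windows


series = list(range(10))       # toy series: 0, 1, ..., 9 (hypothetical data)
samples = shuffled_windows(series, window=5)

for w in samples:
    print(w)                   # each printed window is still consecutive values
```

With PyTorch, passing `shuffle=True` to `torch.utils.data.DataLoader` has the analogous effect of reshuffling which samples are drawn each epoch, assuming each dataset item is one such window.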

accuracy-maker commented 1 year ago

Thank you for answering this, I understand it now!

wuhaixu2016 commented 1 year ago

Thanks for your valuable explanation! @ChuckTG

Zero-coder commented 1 year ago

well done~