cure-lab / LTSF-Linear

[AAAI-23 Oral] Official implementation of the paper "Are Transformers Effective for Time Series Forecasting?"
Apache License 2.0

Three variants essentially the same model? #87

Open linfeng-du opened 1 year ago

linfeng-du commented 1 year ago

Dear authors,

NLinear and DLinear only apply additional linear operations (subtracting the last value, and a moving average) on top of Linear, and none of the three includes any non-linearity. We would therefore get the same results if we solved them via OLS, and might get slightly different results via GD, since the optimization dynamics can differ under different matrix compositions. But that difference does not seem large enough to serve as a valid inductive bias for the model (as the results also show). Please correct me if I'm wrong. Thanks.
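
To make the composition argument concrete, the three variants can be written out for a single channel. The notation here is an assumption of this note, not the paper's: $x \in \mathbb{R}^{L}$ is the lookback window, $T$ the horizon, $A \in \mathbb{R}^{L \times L}$ the moving-average matrix, $e_L$ the last standard basis vector (so $x_L = e_L^\top x$), and $\mathbf{1}_L$, $\mathbf{1}_T$ all-ones vectors.

$$
\begin{aligned}
\text{Linear:}\quad & \hat{y} = W x + b \\
\text{NLinear:}\quad & \hat{y} = W (x - x_L \mathbf{1}_L) + b + x_L \mathbf{1}_T
  = \big( W (I - \mathbf{1}_L e_L^\top) + \mathbf{1}_T e_L^\top \big)\, x + b \\
\text{DLinear:}\quad & \hat{y} = W_s (I - A)\, x + W_t A\, x + b
  = \big( W_s (I - A) + W_t A \big)\, x + b
\end{aligned}
$$

Each variant collapses to one affine map of $x$. For DLinear, choosing $W_s = W_t = W$ reproduces any Linear weight $W$, so the two span the same set of maps. For NLinear, multiplying the collapsed matrix by $\mathbf{1}_L$ gives $\mathbf{1}_T$, so under this parameterization its rows always sum to one. That makes it a row-sum-constrained subset of Linear rather than the full class.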

dkhonker commented 8 months ago

“We would get the same results if we're solving via OLS.” Is there any code for this part?

linfeng-du commented 8 months ago

You don't need any code to verify that. If you solve a linear system via OLS, the result is deterministic.
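
For readers who still want a numerical check, here is a minimal NumPy sketch. It is not code from the repository. It assumes synthetic Gaussian data, a single channel, and a moving average with an odd kernel `K` and replicate padding. The helper names `moving_average_matrix`, `ols_predict`, `decompose` and `remove_last` are hypothetical. Each variant is fitted in closed form with `np.linalg.lstsq`, and the out-of-sample predictions are compared.

```python
import numpy as np

rng = np.random.default_rng(0)
L, T, N, K = 12, 4, 200, 5  # lookback, horizon, samples, odd kernel size (all assumed)


def moving_average_matrix(L, K):
    """Return the L x L matrix A for which A @ x is a stride-1 moving
    average of x, with the edge values repeated as padding."""
    half = (K - 1) // 2
    A = np.zeros((L, L))
    for i in range(L):
        for j in range(i - half, i + half + 1):
            A[i, min(max(j, 0), L - 1)] += 1.0 / K
    return A


def ols_predict(F_train, Y_train, F_test):
    """Least-squares fit with an intercept column. lstsq returns the
    minimum-norm solution when the features are rank-deficient."""
    def with_bias(F):
        return np.hstack([F, np.ones((F.shape[0], 1))])
    theta, *_ = np.linalg.lstsq(with_bias(F_train), Y_train, rcond=1e-10)
    return with_bias(F_test) @ theta


# Synthetic data: rows are samples, columns are time steps.
X_train = rng.normal(size=(N, L))
X_test = rng.normal(size=(50, L))
Y_train = rng.normal(size=(N, T))

A = moving_average_matrix(L, K)


def decompose(X):
    """DLinear-style features: [seasonal, trend] = [x - A x, A x]."""
    trend = X @ A.T
    return np.hstack([X - trend, trend])


def remove_last(X):
    """NLinear-style features: subtract the last lookback value."""
    return X - X[:, -1:]


pred_linear = ols_predict(X_train, Y_train, X_test)
pred_dlinear = ols_predict(decompose(X_train), Y_train, decompose(X_test))
# NLinear fits the shifted target, then adds the last value back.
pred_nlinear = ols_predict(
    remove_last(X_train), Y_train - X_train[:, -1:], remove_last(X_test)
) + X_test[:, -1:]

dlinear_matches = np.allclose(pred_linear, pred_dlinear, atol=1e-8)
nlinear_matches = np.allclose(pred_linear, pred_nlinear, atol=1e-8)
print("DLinear matches Linear under OLS:", dlinear_matches)
print("NLinear matches Linear under OLS:", nlinear_matches)
```

Under these assumptions the DLinear and Linear fits agree, because the decomposed features span the same column space as the raw window. The NLinear fit generally does not agree on arbitrary data: adding the last value back forces the collapsed weights to sum to one per output step, so it solves a constrained version of the same least-squares problem.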