jdb78 / pytorch-forecasting

Time series forecasting with PyTorch
https://pytorch-forecasting.readthedocs.io/
MIT License

Question on N-Beats trend coefficients #372

Closed meganset closed 3 years ago

meganset commented 3 years ago

The N-BEATS trend coefficients are calculated a little differently here than in other implementations. For example, in an older repo, https://github.com/philipperemy/n-beats, the forecast coefficients for 5 periods ahead with a theta dimension of 4 look like:

1         1        1        1        1
0.7894737 1.842105 2.894737 3.947368 5
0.6232687 3.393352 8.379501 15.58172 25
0.4920542 6.250911 24.25645 61.50678 125

whereas pytorch-forecasting uses coefficients (the T_forecast buffer) like:

 1.118       1.118     1.118     1.118    1.118
-1.2413e-16  0.1118    0.22361   0.33541  0.44721
 1.3781e-32  0.01118   0.044721  0.10062  0.17889
-0           0.001118  0.0089443 0.030187 0.071554

Is this an improvement in scaling or normalization, or am I missing something? Thanks (and thanks for the nice forecasting framework).

jdb78 commented 3 years ago

I think the range of frequencies is different because pytorch-forecasting assumes more low-frequency and less high-frequency signal in the data.

meganset commented 3 years ago

Thanks for looking at this. I think I confused things by bringing in the calculation from https://github.com/philipperemy/n-beats, which I think is off by a factor in the denominator.

I'll stick with comparing the trend coefficients as calculated by the paper authors' own NBeats repo (https://github.com/ElementAI/N-BEATS) and the coefficients calculated here. The paper authors calculate the trend coefficients as:

>>> import numpy as np
>>> import torch

>>> bn = 15; fn = 5; p = 4  # 15 backcasts, 5 forecasts, p = number of thetas

>>> torch.tensor(np.concatenate([np.power(np.arange(fn) / fn, i)[None, :] for i in range(p)]))
tensor([[1.0000, 1.0000, 1.0000, 1.0000, 1.0000],
        [0.0000, 0.2000, 0.4000, 0.6000, 0.8000],
        [0.0000, 0.0400, 0.1600, 0.3600, 0.6400],
        [0.0000, 0.0080, 0.0640, 0.2160, 0.5120]], dtype=torch.float64)

pytorch-forecasting has a somewhat different approach:

>>> from pytorch_forecasting.models.nbeats.sub_modules import linspace  # helper used by the trend block (module path assumed)

>>> b, f = linspace(bn, fn, centered=True)
>>> f
array([0., 0.06666667, 0.13333334, 0.2, 0.26666668], dtype=float32)

>>> torch.tensor([f ** i for i in range(p)])
tensor([[1,  1,         1,         1,     1      ],
        [0,  0.066667,  0.13333,   0.2,   0.26667],
        [0,  0.0044444, 0.017778,  0.04,  0.071111],
        [0,  0.0002963, 0.0023704, 0.008, 0.018963]])
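
For context on how this basis is used: in the N-BEATS trend block, the forecast is a linear combination of these polynomial basis rows weighted by the block's theta outputs, so the scale of the basis determines how strongly each polynomial order can influence the forecast. A minimal sketch, reusing f and p from above; the theta values are made up purely for illustration:

>>> basis = torch.tensor([f ** i for i in range(p)])  # (p, fn) polynomial trend basis, same as above
>>> theta = torch.tensor([0.5, 1.0, -2.0, 3.0])       # hypothetical theta output of one trend block
>>> theta @ basis                                      # forecast: one value per forecast step
tensor([0.5000, 0.5587, 0.6049, 0.6440, 0.6813])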

The forecast trend coefficients produced this way are not quite the same as the paper's specification (on page 5). If I adjust for the denominator used in the linspace function, which divides by the maximum of the backcast and forecast lengths, I get the same factors used by the N-BEATS paper, e.g.:

>>> F = (f * bn) / fn  # rescale so the time axis is divided by the forecast length instead of the backcast length

>>> torch.tensor([F ** i for i in range(p)])
tensor([[1.0000, 1.0000, 1.0000, 1.0000, 1.0000],
        [0.0000, 0.2000, 0.4000, 0.6000, 0.8000],
        [0.0000, 0.0400, 0.1600, 0.3600, 0.6400],
        [0.0000, 0.0080, 0.0640, 0.2160, 0.5120]])

Is dividing by the backcast_length a deliberate design choice?

jdb78 commented 3 years ago

Yes, it is a design choice. The idea is that your backcast length is more relevant for the periodicity of the signal than the forecast length. Essentially, this focuses the model more on slow-moving signals, as you would expect in the real world.
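
To make that concrete, here is a minimal sketch (reusing bn = 15, fn = 5, p = 4 from the snippets above) comparing the largest value each polynomial order reaches over the forecast horizon under the two normalizations. Dividing by the longer backcast window, which mirrors the forecast part of linspace(centered=True) shown above, keeps the higher-order (faster-moving) terms much smaller, so the trend output is dominated by the low-order, slowly varying terms unless the network learns correspondingly large thetas:

>>> t_paper = np.arange(fn) / fn         # paper-style: divide by the forecast length
>>> t_pf = np.arange(fn) / max(bn, fn)   # pytorch-forecasting-style: divide by the longer window
>>> [round(float((t_paper ** i).max()), 4) for i in range(p)]
[1.0, 0.8, 0.64, 0.512]
>>> [round(float((t_pf ** i).max()), 4) for i in range(p)]
[1.0, 0.2667, 0.0711, 0.019]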