timeseriesAI / tsai

State-of-the-art Deep Learning library for Time Series and Sequences in Pytorch / fastai
https://timeseriesai.github.io/tsai/
Apache License 2.0

PatchTST with y shape of (n_samples, n_outputs) #738

Open asadabbas09 opened 1 year ago

asadabbas09 commented 1 year ago

Thanks for implementing PatchTST :)

I'm trying to follow the tutorial provided for PatchTST.

The documentation of TSForecaster says:

X: array-like of shape (n_samples, n_steps) or (n_samples, n_features, n_steps) with the input time series samples. Internally, they will be converted to torch tensors.
y: array-like of shape (n_samples), (n_samples, n_outputs) or (n_samples, n_features, n_outputs) with the target. Internally, they will be converted to torch tensors. Default=None. None is used for unlabeled datasets.

The accompanying tutorial works well when we have

X.shape, y.shape
((803, 7, 104), (803, 7, 60))

But when y.shape is (803, 60), it throws an error:

RuntimeError: The size of tensor a (6720) must match the size of tensor b (960) at non-singleton dimension 0

Shouldn't it still work, since TSForecaster can take y of shape (n_samples, n_outputs)?

I'm currently using TSTPlus, and it works with a y.shape of (n_samples, n_outputs) on my own dataset.
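
The two sizes in the error look consistent with predictions and targets being flattened before they are compared: 6720 = 16 × 7 × 60 and 960 = 16 × 60. The batch size of 16 is only inferred from those numbers, not something the traceback states, so this is a sanity check rather than a diagnosis.

```python
# Shapes taken from the post; batch_size is an ASSUMPTION inferred from the error sizes.
n_vars, horizon = 7, 60
pred_size, target_size = 6720, 960  # sizes reported in the RuntimeError

batch_size = target_size // horizon  # 960 // 60

# The target holds one channel per sample, the prediction holds all 7.
assert batch_size * horizon == target_size
assert batch_size * n_vars * horizon == pred_size

print(batch_size)  # 16
```

If that reading is right, the model is still emitting all 7 channels while y carries only one.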

oguiza commented 1 year ago

Hi @asadabbas09, you are right. That should work. I'll take a look at it as soon as I can. As a workaround, you can always use a y of shape (n_samples, 1, n_outputs), since you are predicting only 1 variable.

asadabbas09 commented 1 year ago

Thanks, @oguiza. I tried (803, 1, 60), i.e. (n_samples, 1, n_outputs), but still got the same error.

oguiza commented 1 year ago

Hi @asadabbas09, I didn't realize that the number of channels in X and y is different (7 and 1). For now, only PatchTST has been added to the library, and it is based on the original paper code. The issue is that PatchTST supports univariate and multivariate predictions only when the number of input and output channels is the same. I'm planning to implement a variation of the original code (PatchTSTPlus) that will allow you to create a univariate forecast based on a multivariate input.
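
A quick pre-flight check makes the constraint above concrete. This is a minimal sketch using NumPy only: `channels_match` is a hypothetical helper, not part of tsai, and it assumes the (n_samples, n_features, n_steps) layout quoted from the TSForecaster documentation.

```python
import numpy as np

def channels_match(X, y):
    """Return True if X and y have the same number of channels.

    Hypothetical helper (not a tsai function). 2D arrays are treated
    as single-channel: (n_samples, n_steps) -> (n_samples, 1, n_steps).
    """
    X, y = np.asarray(X), np.asarray(y)
    if X.ndim == 2:
        X = X[:, None, :]
    if y.ndim == 2:
        y = y[:, None, :]
    if X.ndim != 3 or y.ndim != 3:
        raise ValueError("expected 2D or 3D arrays for X and y")
    return X.shape[1] == y.shape[1]

X = np.zeros((803, 7, 104))
print(channels_match(X, np.zeros((803, 7, 60))))  # True
print(channels_match(X, np.zeros((803, 60))))     # False
print(channels_match(X, np.zeros((803, 1, 60))))  # False
```

Both the 2D target and the (n_samples, 1, n_outputs) workaround fail the check against a 7-channel X, which matches what was reported above.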

asadabbas09 commented 1 year ago

Thanks @oguiza,

Still, it's a bit confusing for me. I'm following the discussion at PatchTST: https://github.com/yuqinie98/PatchTST/issues/26

The author said there that it is possible to use a multivariate input to predict a univariate output.

Does this mean that, according to https://github.com/yuqinie98/PatchTST/issues/26, we can use (803, 7, 104) to predict (803, 1, 60)?

I'm not sure if I'm understanding correctly.

oguiza commented 1 year ago

That's true. But if you read the whole paragraph, they add: "You would need to modify the head to make sure it outputs the right shape. Please feel free to let us know how it goes." Those are the changes that I plan to automate with PatchTSTPlus. Note: It's not just the head that needs to be modified. PatchTST makes use of RevIN layers, and those would need to be modified as well.
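
To illustrate why RevIN is involved: it normalizes each input channel with its own per-sample statistics and later applies the inverse transform to the model output, so the output has to line up with those statistics channel by channel. Below is a NumPy sketch of that idea without the learnable affine parameters. The function names and `target_idx` are hypothetical, and this is not the PatchTST or tsai implementation.

```python
import numpy as np

def revin_norm(x, eps=1e-5):
    """Normalize each channel of x (n_samples, n_vars, n_steps) per sample."""
    mean = x.mean(axis=-1, keepdims=True)               # (n_samples, n_vars, 1)
    std = np.sqrt(x.var(axis=-1, keepdims=True) + eps)  # (n_samples, n_vars, 1)
    return (x - mean) / std, mean, std

def revin_denorm(out, mean, std, target_idx=None):
    """Undo the normalization on the model output.

    target_idx (hypothetical) picks the statistics of the single channel
    being forecast when `out` has only one channel.
    """
    if target_idx is not None:
        mean = mean[:, target_idx:target_idx + 1]
        std = std[:, target_idx:target_idx + 1]
    return out * std + mean

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 7, 104))
x_norm, mean, std = revin_norm(x)

out_multi = np.zeros((4, 7, 60))  # stand-in for a 7-channel forecast
out_uni = np.zeros((4, 1, 60))    # stand-in for a 1-channel forecast

print(revin_denorm(out_multi, mean, std).shape)              # (4, 7, 60)
print(revin_denorm(out_uni, mean, std, target_idx=0).shape)  # (4, 1, 60)
# Without target_idx, broadcasting silently expands the result to 7 channels.
print(revin_denorm(out_uni, mean, std).shape)                # (4, 7, 60)
```

In this sketch the single-channel output only keeps its shape when the statistics of the target channel are selected explicitly, which is the kind of change a modified head alone would not cover.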