romilbert / samformer

Official implementation of SAMformer, a transformer leveraging Sharpness-Aware Minimization and Channel-Wise Attention for Time Series Forecasting.
MIT License
70 stars, 8 forks

Using pytorch version for multivariate regression on variable length 1D output #6

Open erickmiller opened 1 month ago

erickmiller commented 1 month ago

This looks like a really promising project and paper. Very promising results in the paper.

So -- I was testing the implementation in ./samformer_pytorch/ first using run_demo.py and this is working great as it's coded, but when I test it with data other than the sample data provided, I am running into a dimensionality mismatch issue when attempting to train using a multivariate regression style model -- i.e. multiple dimensions going in (in my test case there were 7) and a univariate one dimensional floating point number as the output to learn and/or predict.

From what I can tell after reading parts of the paper and doing a bit of research, this model, like others, should support multivariate regression quite nicely and shouldn't require much modification. I was having trouble, though, because I couldn't figure out how to set the output dimensionality correctly while maintaining the predictive power and accuracy of the model.

It seems the model and demo code that shows how to use the model in run_demo.py expects to output predicted values that match the dimensionality of the input features? So for example, if I try to predict 10 timesteps into the future and have 7 features in my dataset, it wants a shape of 70 for the y vector, rather than just a shape of 10. Please correct me if I am making any false assumptions or have interpreted the code wrongly.

In the past I have solved this by deriving a new nn and sandwiching the model with some linear layers like so, with Mamba as an example:

nn.Sequential(
    nn.Linear(num_features, model_dimensions),
    Mamba(self.config),
    nn.Linear(model_dimensions, output_dimensions),
    nn.Tanh()
)

I am having trouble getting anything to work with SAMFormer and was hoping you could supply a simple example of taking multidimensional input and using it to train on a single dimension (or variable dimension) output of variable length. For example, one use-case is to predict future time values, such as using multiple input dimensions to predict a single dimension's future values N number of time-steps into the future -- and, another arguably much simpler but highly useful and relevant use-case is to train on and predict a simple 1D floating point number of 1 step in length, as a signal that for example fluctuates between -1 and 1, and can be used for decision making in future upstream code, models, prediction systems, etc. This is what I'm trying to test the model's performance for, but can't seem to get beyond the dimensionality errors due to the output dimensionality mismatch.

I thought I'd solve this quickly, but I spent an unexpected amount of time trying to get it to work last night and kept hitting dimensionality errors. If anyone on the team could throw together a functional Python snippet that takes the output dimensionality as an argument, or anything that uses multivariate inputs to predict a sequence of single values into the future, or just a single 1-step, 1D output (one floating point number as the prediction), that would be super awesome. Thank you in advance.

Also, I suppose I should mention: there is a bug in run_demo.py with the path used to load the data. The script assumes the data is in cwd, but it is one directory up, so the directory structure must have changed since this was tested. It's simple to fix by adding .. to "dataset/ETTh1.csv" so the path is "../dataset/ETTh1.csv". This is not the purpose of this ticket, but I figured I should mention it.
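As a side note on the path issue, one way to avoid depending on the current working directory is to resolve the dataset location relative to the script file. This is only a sketch: the helper name `dataset_path` is hypothetical, and it assumes the layout described in this comment, with the script one directory below the folder that contains `dataset/`.

```python
from pathlib import Path


def dataset_path(script_file, name="ETTh1.csv"):
    """Return <parent of script dir>/dataset/<name>.

    Assumes the script sits one directory below the folder holding
    `dataset/` (the layout described in this comment).
    """
    repo_root = Path(script_file).resolve().parent.parent
    return repo_root / "dataset" / name


# Example with a made-up location for the script.
example = dataset_path("/tmp/repo/samformer_pytorch/run_demo.py")
```

Inside run_demo.py this would be called as `dataset_path(__file__)`, so the script works regardless of where it is launched from.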

erickmiller commented 1 month ago

Tagging @vfeofanov 👍 thanks for your help guys

erickmiller commented 1 month ago

Hey @romilbert @vfeofanov just a friendly ping to hear your thoughts and hopefully any help on this topic?

To give you a bit more information, I kind of hit a wall in terms of trying to manipulate the output dimensionality when it gets to the Reversible Instance Normalization (RevIN) layer at the very end -- it seems this function always transforms the data back to the same number of dimensions as the input dimensionality. I wasn't sure if this part regarding the output dimensionality matching the input dimensionality is implicitly required for the entire model / architecture to work as intended, and was planning on re-reading your paper and the paper(s) about RevIN but haven't had a chance yet.

My intuition is still that your model and architecture should be easily adaptable to support variable output dimensionality, but I'm still struggling to get it to work.

romilbert commented 2 weeks ago

Hi Erik,

You're correct. In the code, RevIN requires that the input and output dimensions match for it to function properly. It's like having an invertible function, which is only possible from R^n to R^m if m=n. This dimensionality requirement ensures that the normalization and de-normalization processes work as intended. If the dimensions don't match, the statistics used by RevIN can't be applied correctly.
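To make the point about per-channel statistics concrete, here is a minimal sketch of RevIN-style normalization. It is not the repository's RevIN class: the function names `revin_normalize` and `revin_denormalize` are hypothetical, the learnable affine parameters are left out, and the tensor layout (batch, seq_len, num_channels) is an assumption.

```python
import torch


def revin_normalize(x, eps=1e-5):
    """Normalize each channel of each instance over the time axis.

    x: (batch, seq_len, num_channels); this layout is an assumption.
    Returns the normalized tensor and the per-channel statistics.
    """
    mean = x.mean(dim=1, keepdim=True)  # (batch, 1, num_channels)
    std = torch.sqrt(x.var(dim=1, keepdim=True, unbiased=False) + eps)
    return (x - mean) / std, (mean, std)


def revin_denormalize(y, stats):
    """Undo the normalization with the statistics saved from the input."""
    mean, std = stats
    return y * std + mean


x = torch.randn(4, 96, 7)  # 7 input channels
x_norm, stats = revin_normalize(x)

# Same channel count: de-normalization recovers the input.
x_back = revin_denormalize(x_norm, stats)

# A single-channel output has no matching statistics. In this sketch,
# broadcasting silently expands it back to 7 channels instead of failing.
y_single = torch.randn(4, 10, 1)
y_expanded = revin_denormalize(y_single, stats)  # shape (4, 10, 7)
```

The statistics are stored per input channel, so any change in the number of output channels has to happen after de-normalization, not before it.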

Hope this clarifies things

vfeofanov commented 6 days ago

Hi @erickmiller, I apologize for not replying to you earlier.

For SAMFormerArchitecture's forward method, I changed the code by adding an argument flatten_output that will decide if two or three dimensional output is desired.

To answer your question completely, may I ask for a clarification? Does the target time series variable belong to the set of variables used as features, or is it a completely different measurement? In the first case, you can simply take the i-th dimension of the output, where i is the index of the target variable (you need to call the forward function with flatten_output=False).
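A minimal sketch of the first case, assuming the unflattened output has shape (bs_size, num_channels, pred_horizon). A random tensor stands in for the model output here, and `target_idx` is a hypothetical index for the target variable.

```python
import torch
import torch.nn.functional as F

bs_size, num_channels, pred_horizon = 32, 7, 10
target_idx = 3  # hypothetical position of the target variable among the features

# Stand-in for the model output obtained with flatten_output=False.
out = torch.randn(bs_size, num_channels, pred_horizon)

# Keep only the forecast for the target channel.
y_pred = out[:, target_idx, :]  # (bs_size, pred_horizon)

# The loss can then be computed on that channel alone.
y_true = torch.randn(bs_size, pred_horizon)  # placeholder targets
loss = F.mse_loss(y_pred, y_true)
```

Whether to train on all channels and select one at inference, or to compute the loss on the target channel only as above, is a design choice this sketch does not settle.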

In the second case, some modifications are indeed needed. The simplest thing that comes to my mind is to add a linear layer after SAMFormer's output. I can propose two options:

a) Let's say out is the output of SAMFormer, whose shape is (bs_size, pred_horizon * num_channels). You can simply define a linear layer like `nn.Linear(pred_horizon * num_channels, pred_horizon)`.

b) Another option is to set flatten_output=False, which gives out a shape of (bs_size, num_channels, pred_horizon). Then you can define `lin = nn.Linear(num_channels, 1)` and do the following: `out = out.transpose(1, 2); out = lin(out)`.
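The two options can be sketched as follows. Random tensors stand in for the SAMFormer output, and the names `head_a`, `head_b`, and `head_scalar` are hypothetical. The last head, a single value squashed into [-1, 1] with `nn.Tanh`, is an extra assumption added to match the one-step use case from the original question.

```python
import torch
from torch import nn

bs_size, num_channels, pred_horizon = 32, 7, 10

# Option (a): flattened output followed by a linear head.
out_flat = torch.randn(bs_size, pred_horizon * num_channels)  # stand-in output
head_a = nn.Linear(pred_horizon * num_channels, pred_horizon)
y_a = head_a(out_flat)  # (bs_size, pred_horizon)

# Option (b): unflattened output, channels mixed at each time step.
out_3d = torch.randn(bs_size, num_channels, pred_horizon)  # stand-in output
head_b = nn.Linear(num_channels, 1)
y_b = head_b(out_3d.transpose(1, 2))  # (bs_size, pred_horizon, 1)
y_b = y_b.squeeze(-1)  # (bs_size, pred_horizon)

# Assumed variant: one scalar per sample, bounded to [-1, 1].
head_scalar = nn.Sequential(
    nn.Linear(pred_horizon * num_channels, 1),
    nn.Tanh(),
)
y_scalar = head_scalar(out_flat)  # (bs_size, 1)
```

Option (a) learns a separate weight for every (channel, time step) pair, while option (b) shares the same channel weights across all time steps, so it has far fewer parameters.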

Let me know if you have more questions.

Vasilii