nzl5116190 / Basisformer

This is the PyTorch implementation of BasisFormer from the NeurIPS paper: [BasisFormer: Attention-based Time Series Forecasting with Learnable and Interpretable Basis]

Support for Multivariate Input with Univariate Prediction in the Code #8

Closed 619797944 closed 1 month ago

619797944 commented 1 month ago

Hello, I encountered an issue while running the Basisformer model using 'MS' as the feature argument. When I execute the code, I receive the following error:

```
File "/home/MG3052CL/Basisformer-main/model.py", line 91, in forward
    l_neg = torch.bmm(logit_q.reshape(-1,self.N,self.k), logit_k.reshape(-1,self.N,self.k).permute(0,2,1)).reshape(-1,self.N)  # (B,C*N,N)
RuntimeError: Expected size for first two dimensions of batch2 tensor to be: [224, 16] but got: [32, 16].
```

This dimension mismatch error occurs regardless of the dataset I use. Could you please suggest a possible solution?
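For what it's worth, the mismatch can be reproduced in isolation. The numbers in the traceback line up with B=32 and C=7: under MS, x contributes C channels while y has only 1, so after `reshape(-1, N, k)` the batch dimensions disagree (B*C = 224 vs B*1 = 32). A minimal sketch with hypothetical sizes (N=10, k=16):

```python
import torch

# Hypothetical sizes: B=32, C=7 (channels of x), N=10, k=16.
B, C, N, k = 32, 7, 10, 16
logit_q = torch.randn(B, C, N, k).reshape(-1, N, k)   # batch dim = B*C = 224
logit_k = torch.randn(B, 1, N, k).reshape(-1, N, k)   # batch dim = B*1 = 32

try:
    torch.bmm(logit_q, logit_k.permute(0, 2, 1))
except RuntimeError as err:
    print(err)  # torch reports the [224, 16] vs [32, 16] batch mismatch
```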

Thank you ~

nzl5116190 commented 1 month ago

Hello, thank you for your question! The issue you're encountering stems from the MS setting: here the shape of x is (B, C, N), while the shape of y is (B, 1, N).

Corresponding to the code, we have:

```python
logit_q = score.permute(0,2,3,1)
logit_k = score_y.permute(0,2,3,1)
```

At this point, the shape of logit_q is (B, C, N, k), and the shape of logit_k is (B, 1, N, k). This discrepancy in shapes causes the mismatch during the matrix multiplication operation.

The original code was designed for multivariate (M) and univariate (S) settings, without considering the MS case, which is why this issue arises. I apologize for the inconvenience.

To resolve this issue, you can add an MLP layer to the network. For instance:

```python
self.MLP_MS = wn(nn.Linear(C, 1))
```

Here, C represents the number of channels in the x data, which should be passed as a parameter. In your case, it seems to be 7.
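As a standalone sketch of how that projection behaves (the sizes B, k, N here are hypothetical, and `wn` is `torch.nn.utils.weight_norm` as elsewhere in the repo's code):

```python
import torch
import torch.nn as nn
from torch.nn.utils import weight_norm as wn

B, k, C, N = 32, 16, 7, 10          # hypothetical sizes; C = channels of x
MLP_MS = wn(nn.Linear(C, 1))

score = torch.randn(B, k, C, N)
# Move the channel dim to the last axis so the Linear collapses C -> 1,
# then move it back:
score = MLP_MS(score.permute(0, 1, 3, 2)).permute(0, 1, 3, 2)
print(score.shape)  # torch.Size([32, 16, 1, 10])
```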

You can then transform the basis coefficients from (B, k, C, N) to (B, k, 1, N) using this MLP layer. Specifically, you should modify lines 62-66 of model.py as follows:

```python
score, attn_x1, attn_x2 = self.coefnet(m1, feature)           # (B, k, C, N)
score = self.MLP_MS(score.permute(0,1,3,2)).permute(0,1,3,2)  # (B, k, 1, N)

base = self.MLP_y(raw_m2).reshape(B, self.N, self.k, -1).permute(0, 2, 1, 3)  # (B, k, N, L/k)
out = torch.matmul(score, base).permute(0, 2, 1, 3).reshape(B, 1, -1)         # (B, 1, k * (L/k))
out = self.MLP_sy(out).reshape(B, 1, -1).permute(0, 2, 1)                     # (B, L, 1)
```

(Note that after the projection the second dimension is 1, not C, so the reshape must use `reshape(B, 1, -1)`; with C=7 the element count would not match.)
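To sanity-check the shapes, here is a standalone walk-through with dummy tensors (hypothetical sizes; the network's MLPs are replaced by random tensors of the shapes they would produce):

```python
import torch

B, k, N, Lk = 32, 16, 10, 6          # hypothetical sizes; Lk = L/k, so L = 96
score = torch.randn(B, k, 1, N)      # coefficients after the MLP_MS projection
base  = torch.randn(B, k, N, Lk)     # basis slices, shaped as MLP_y would produce

out = torch.matmul(score, base)                    # (B, k, 1, L/k)
out = out.permute(0, 2, 1, 3).reshape(B, 1, -1)    # (B, 1, k * (L/k))
assert out.shape == (B, 1, k * Lk)                 # (32, 1, 96)
```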

In theory, this modification should allow the training to proceed smoothly. If you encounter any further issues, please don't hesitate to reach out.

Best regards~