thuml / iTransformer

Official implementation for "iTransformer: Inverted Transformers Are Effective for Time Series Forecasting" (ICLR 2024 Spotlight), https://openreview.net/forum?id=JePfAI8fah
https://arxiv.org/abs/2310.06625
MIT License

Are the following two lines redundant: `batch_x = batch_x[:, :, partial_start:partial_end]`, `batch_y = batch_y[:, :, partial_start:partial_end]`? #73

Closed: arwooy closed this issue 4 months ago

arwooy commented 5 months ago

The following code snippet is from exp_long_term_forecasting_partial.py:

Variate Generalization training:

            # We train with partial variates (args.enc_in < number of dataset variates)
            # and test the obtained model directly on all variates.
            partial_start = self.args.partial_start_index
            partial_end = min(self.args.enc_in + partial_start, batch_x.shape[-1])
            batch_x = batch_x[:, :, partial_start:partial_end]
            batch_y = batch_y[:, :, partial_start:partial_end]
            # Efficient training strategy: randomly choose part of the variates
            # and only train the model with selected variates in each batch 
            if self.args.efficient_training:
                _, _, N = batch_x.shape
                index = np.stack(random.sample(range(N), N))[-self.args.enc_in:]
                batch_x = batch_x[:, :, index]
                batch_y = batch_y[:, :, index]

In this snippet, are the following two lines redundant: `batch_x = batch_x[:, :, partial_start:partial_end]`, `batch_y = batch_y[:, :, partial_start:partial_end]`?

If the slice is always `partial_start:partial_end`, doesn't that mean only those variates are ever trained?

ZDandsomSP commented 4 months ago

Hello.

Your statement is correct. This part of the code serves our experiment on the variate generalization of iTransformer: we train with a fixed 20% of the variates throughout the training phase and then run inference on all variates. The experiment showed that the performance degradation in this setting is very small, which indicates that the model generalizes well across variates.
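To make that setting concrete, here is a minimal sketch (not the repository's exact code) of training on a fixed slice of variates and testing on all of them; it assumes a `model` that accepts inputs of shape `[B, T, N]` for any number of variates `N`, which is what the inverted architecture allows, and the function names and arguments are illustrative:

    import torch

    def train_step(model, optimizer, criterion, batch_x, batch_y, enc_in, partial_start=0):
        # Train on a fixed subset of variates only (e.g. 20% of the dataset).
        partial_end = min(partial_start + enc_in, batch_x.shape[-1])
        x = batch_x[:, :, partial_start:partial_end]
        y = batch_y[:, :, partial_start:partial_end]
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        return loss.item()

    @torch.no_grad()
    def test_step(model, criterion, batch_x, batch_y):
        # Inference uses all variates, including those never seen during training.
        return criterion(model(batch_x), batch_y).item()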

Our paper also includes another efficient training experiment, in which we randomly select 20% of the variates in each training batch.
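For reference, a minimal sketch of that per-batch random selection (using `np.random.choice` instead of the `random.sample` trick in the snippet quoted above; names are illustrative, not the repository's API):

    import numpy as np

    def sample_variates(batch_x, batch_y, enc_in):
        # Draw a fresh random subset of `enc_in` variate indices for every batch,
        # so different variates are covered over the course of training.
        N = batch_x.shape[-1]
        index = np.random.choice(N, size=enc_in, replace=False)
        return batch_x[:, :, index], batch_y[:, :, index]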