thuml / iTransformer

Official implementation for "iTransformer: Inverted Transformers Are Effective for Time Series Forecasting" (ICLR 2024 Spotlight), https://openreview.net/forum?id=JePfAI8fah
https://arxiv.org/abs/2310.06625
MIT License

Can't run Transformer model instead of iTransformer #129

Closed JinfengM closed 3 weeks ago

JinfengM commented 1 month ago

Hi, I'm trying to run the experiment. Unfortunately, I found that I can't run Transformer. Below are my steps:

  1. Use the following script to train and create the Transformer model, which will be saved in the checkpoints directory; everything is OK:

```shell
export CUDA_VISIBLE_DEVICES=0

model_name=Transformer

python -u run.py \
  --is_training 0 \
  --root_path ./dataset/weather/ \
  --data_path weather.csv \
  --model_id weather_96_96 \
  --model $model_name \
  --data custom \
  --features M \
  --seq_len 96 \
  --pred_len 96 \
  --e_layers 3 \
  --enc_in 21 \
  --dec_in 21 \
  --c_out 21 \
  --des 'Exp' \
  --d_model 512 \
  --d_ff 512 \
  --itr 1
```

  2. I changed `--is_training 0` to 1 to run prediction, and I also invoke the `exp.predict` function using the code below:

exp.predict(setting, load=1)

When predict runs, it always raises the following error, which is very strange, since the test function runs correctly. Why can't predict?

This situation has confused me for two weeks; could you tell me why?

```
File "/home/mjf/iTransformer-main/run.py", line 168, in <module>
    exp.predict(setting, load=1)
File "/home/mjf/iTransformer-main/experiments/exp_long_term_forecasting.py", line 319, in predict
    outputs = self.model(batch_x, batch_x_mark, dec_inp, batch_y_mark)
File "/opt/anaconda/envs/vllm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
File "/opt/anaconda/envs/vllm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
File "/home/mjf/iTransformer-main/model/Transformer.py", line 89, in forward
    dec_out = self.forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)
File "/home/mjf/iTransformer-main/model/Transformer.py", line 83, in forecast
    dec_out = self.dec_embedding(x_dec, x_mark_dec)
File "/opt/anaconda/envs/vllm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
File "/opt/anaconda/envs/vllm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
File "/home/mjf/iTransformer-main/layers/Embed.py", line 125, in forward
    x = self.value_embedding(
RuntimeError: The size of tensor a (96) must match the size of tensor b (144) at non-singleton dimension 1
```
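For context on what the traceback is complaining about: inside the decoder embedding, an embedding of the decoder values (whose length is label_len + pred_len = 48 + 96 = 144 steps) is added to an embedding built from the decoder time marks, which here only spans 96 steps. A minimal numpy sketch of the same shape clash; the 512 `d_model` comes from the script above, everything else is illustrative rather than the repo's actual code:

```python
import numpy as np

d_model = 512  # from --d_model 512 in the script above

# Shapes as implied by the traceback (illustrative):
value_emb = np.zeros((1, 144, d_model))     # from dec_inp: label_len (48) + pred_len (96) steps
temporal_emb = np.zeros((1, 96, d_model))   # from decoder time marks: only 96 steps

try:
    _ = value_emb + temporal_emb  # lengths 144 vs 96 on axis 1 cannot broadcast
    raised = False
except ValueError:
    # numpy raises ValueError; PyTorch raises the RuntimeError shown above
    raised = True

print(raised)  # True
```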

JinfengM commented 1 month ago

Has anyone tried the Transformer model?

JinfengM commented 1 month ago

I think I know why Transformer fails to run. Take 96-48-96 weather forecasting as an example. Prediction needs a seq_y of size 48 (label_len) + 96 (pred_len) = 144; however, Dataset_Pred only provides the label_len of 48, so you need to extend it to 144.
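The extension described above can be sketched as follows: the decoder-side inputs returned for prediction must cover label_len + pred_len steps, not just label_len. This is a minimal numpy sketch under that assumption; `future_mark` stands in for time features extrapolated past the last known timestamp and is hypothetical, not code from the repo:

```python
import numpy as np

label_len, pred_len, n_time_feats = 48, 96, 4

# What Dataset_Pred yields for the decoder side: only label_len steps
seq_y_mark = np.random.randn(label_len, n_time_feats)

# Hypothetical future time features for the forecast horizon
# (e.g. generated from timestamps extrapolated at the data's sampling frequency)
future_mark = np.zeros((pred_len, n_time_feats))

# The extended marks now match the decoder input's length of label_len + pred_len
dec_mark = np.concatenate([seq_y_mark, future_mark], axis=0)
print(dec_mark.shape[0])  # 144
```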

ikhsansdqq commented 1 month ago

Hi, for the Transformer I am not using the iTransformer authors' version; instead, I built the Transformer model directly from the Hugging Face documentation!