thuml / iTransformer

Official implementation for "iTransformer: Inverted Transformers Are Effective for Time Series Forecasting" (ICLR 2024 Spotlight), https://openreview.net/forum?id=JePfAI8fah
https://arxiv.org/abs/2310.06625
MIT License

got killed #82

Closed wordcount closed 1 month ago

wordcount commented 3 months ago

bash ./scripts/multivariate_forecasting/Traffic/iTransformer.sh

Args in experiment:
Namespace(is_training=1, model_id='traffic_96_96', model='iTransformer', data='custom', root_path='./dataset/traffic/', data_path='traffic.csv', features='M', target='OT', freq='h', checkpoints='./checkpoints/', seq_len=96, label_len=48, pred_len=96, enc_in=862, dec_in=862, c_out=862, d_model=512, n_heads=8, e_layers=4, d_layers=1, d_ff=512, moving_avg=25, factor=1, distil=True, dropout=0.1, embed='timeF', activation='gelu', output_attention=False, do_predict=False, num_workers=10, itr=1, train_epochs=10, batch_size=16, patience=3, learning_rate=0.001, des='Exp', loss='MSE', lradj='type1', use_amp=False, use_gpu=False, gpu=0, use_multi_gpu=False, devices='0,1,2,3', exp_name='MTSF', channel_independence=False, inverse=False, class_strategy='projection', target_root_path='./data/electricity/', target_data_path='electricity.csv', efficient_training=False, use_norm=True, partial_start_index=0)
Use CPU
start training : traffic_96_96_iTransformer_custom_M_ft96_sl48_ll96_pl512_dm8_nh4_el1_dl512_df1_fctimeF_ebTrue_dtExp_projection_0>>>>>>>>>>>>>>>>>>>>>>>>>>
train 12089
val 1661
test 3413
./scripts/multivariate_forecasting/Traffic/iTransformer.sh: line 24: 633 Killed python -u run.py --is_training 1 --root_path ./dataset/traffic/ --data_path traffic.csv --model_id traffic_96_96 --model $model_name --data custom --features M --seq_len 96 --pred_len 96 --e_layers 4 --enc_in 862 --dec_in 862 --c_out 862 --des 'Exp' --d_model 512 --d_ff 512 --batch_size 16 --learning_rate 0.001 --itr 1

Args in experiment:
Namespace(is_training=1, model_id='traffic_96_192', model='iTransformer', data='custom', root_path='./dataset/traffic/', data_path='traffic.csv', features='M', target='OT', freq='h', checkpoints='./checkpoints/', seq_len=96, label_len=48, pred_len=192, enc_in=862, dec_in=862, c_out=862, d_model=512, n_heads=8, e_layers=4, d_layers=1, d_ff=512, moving_avg=25, factor=1, distil=True, dropout=0.1, embed='timeF', activation='gelu', output_attention=False, do_predict=False, num_workers=10, itr=1, train_epochs=10, batch_size=16, patience=3, learning_rate=0.001, des='Exp', loss='MSE', lradj='type1', use_amp=False, use_gpu=False, gpu=0, use_multi_gpu=False, devices='0,1,2,3', exp_name='MTSF', channel_independence=False, inverse=False, class_strategy='projection', target_root_path='./data/electricity/', target_data_path='electricity.csv', efficient_training=False, use_norm=True, partial_start_index=0)
Use CPU
start training : traffic_96_192_iTransformer_custom_M_ft96_sl48_ll192_pl512_dm8_nh4_el1_dl512_df1_fctimeF_ebTrue_dtExp_projection_0>>>>>>>>>>>>>>>>>>>>>>>>>>
train 11993
val 1565
test 3317
./scripts/multivariate_forecasting/Traffic/iTransformer.sh: line 45: 916 Killed python -u run.py --is_training 1 --root_path ./dataset/traffic/ --data_path traffic.csv --model_id traffic_96_192 --model $model_name --data custom --features M --seq_len 96 --pred_len 192 --e_layers 4 --enc_in 862 --dec_in 862 --c_out 862 --des 'Exp' --d_model 512 --d_ff 512 --batch_size 16 --learning_rate 0.001 --itr 1

Args in experiment:
Namespace(is_training=1, model_id='traffic_96_336', model='iTransformer', data='custom', root_path='./dataset/traffic/', data_path='traffic.csv', features='M', target='OT', freq='h', checkpoints='./checkpoints/', seq_len=96, label_len=48, pred_len=336, enc_in=862, dec_in=862, c_out=862, d_model=512, n_heads=8, e_layers=4, d_layers=1, d_ff=512, moving_avg=25, factor=1, distil=True, dropout=0.1, embed='timeF', activation='gelu', output_attention=False, do_predict=False, num_workers=10, itr=1, train_epochs=10, batch_size=16, patience=3, learning_rate=0.001, des='Exp', loss='MSE', lradj='type1', use_amp=False, use_gpu=False, gpu=0, use_multi_gpu=False, devices='0,1,2,3', exp_name='MTSF', channel_independence=False, inverse=False, class_strategy='projection', target_root_path='./data/electricity/', target_data_path='electricity.csv', efficient_training=False, use_norm=True, partial_start_index=0)
Use CPU
start training : traffic_96_336_iTransformer_custom_M_ft96_sl48_ll336_pl512_dm8_nh4_el1_dl512_df1_fctimeF_ebTrue_dtExp_projection_0>>>>>>>>>>>>>>>>>>>>>>>>>>
train 11849
val 1421
test 3173
./scripts/multivariate_forecasting/Traffic/iTransformer.sh: line 66: 1241 Killed python -u run.py --is_training 1 --root_path ./dataset/traffic/ --data_path traffic.csv --model_id traffic_96_336 --model $model_name --data custom --features M --seq_len 96 --pred_len 336 --e_layers 4 --enc_in 862 --dec_in 862 --c_out 862 --des 'Exp' --d_model 512 --d_ff 512 --batch_size 16 --learning_rate 0.001 --itr 1

Args in experiment:
Namespace(is_training=1, model_id='traffic_96_720', model='iTransformer', data='custom', root_path='./dataset/traffic/', data_path='traffic.csv', features='M', target='OT', freq='h', checkpoints='./checkpoints/', seq_len=96, label_len=48, pred_len=720, enc_in=862, dec_in=862, c_out=862, d_model=512, n_heads=8, e_layers=4, d_layers=1, d_ff=512, moving_avg=25, factor=1, distil=True, dropout=0.1, embed='timeF', activation='gelu', output_attention=False, do_predict=False, num_workers=10, itr=1, train_epochs=10, batch_size=16, patience=3, learning_rate=0.001, des='Exp', loss='MSE', lradj='type1', use_amp=False, use_gpu=False, gpu=0, use_multi_gpu=False, devices='0,1,2,3', exp_name='MTSF', channel_independence=False, inverse=False, class_strategy='projection', target_root_path='./data/electricity/', target_data_path='electricity.csv', efficient_training=False, use_norm=True, partial_start_index=0)
Use CPU
start training : traffic_96_720_iTransformer_custom_M_ft96_sl48_ll720_pl512_dm8_nh4_el1_dl512_df1_fctimeF_ebTrue_dtExp_projection_0>>>>>>>>>>>>>>>>>>>>>>>>>>
train 11465
val 1037
test 2789
./scripts/multivariate_forecasting/Traffic/iTransformer.sh: line 87: 1538 Killed python -u run.py --is_training 1 --root_path ./dataset/traffic/ --data_path traffic.csv --model_id traffic_96_720 --model $model_name --data custom --features M --seq_len 96 --pred_len 720 --e_layers 4 --enc_in 862 --dec_in 862 --c_out 862 --des 'Exp' --d_model 512 --d_ff 512 --batch_size 16 --learning_rate 0.001 --itr 1

WenWeiTHU commented 3 months ago

This is probably an out-of-memory (OOM) kill. Monitor the memory usage of the device (host RAM here, since the log shows Use CPU) and reduce the batch size accordingly.
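To make the suggestion concrete, here is a minimal sketch, not code from the repository. It assumes one attention token per variate (enc_in=862 from the log), float32 values, and a score tensor of shape (batch, heads, tokens, tokens) in each encoder layer. The helper names `attention_scores_mib` and `peak_rss_mib` are hypothetical. The estimate covers only the attention scores, so real usage is higher once activations, gradients, optimizer state and data-loader workers are included.

```python
import resource  # Unix only
import sys


def peak_rss_mib() -> float:
    """Peak resident set size of the current process, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def attention_scores_mib(batch_size: int, n_heads: int, n_tokens: int,
                         bytes_per_value: int = 4) -> float:
    """Rough size of one layer's attention score tensor, shape (B, H, N, N)."""
    n_values = batch_size * n_heads * n_tokens * n_tokens
    return n_values * bytes_per_value / 2**20


if __name__ == "__main__":
    # n_heads=8 and 862 tokens come from the logged arguments (assumed mapping).
    for batch_size in (16, 8, 4):
        est = attention_scores_mib(batch_size, n_heads=8, n_tokens=862)
        print(f"batch_size={batch_size}: ~{est:.0f} MiB of attention scores per layer")
    # Calling this inside the training loop shows how close the run gets to the limit.
    print(f"peak RSS so far: {peak_rss_mib():.1f} MiB")
```

Under these assumptions the estimate scales linearly with the batch size, so changing --batch_size 16 to 8 or 4 in the script roughly halves or quarters this term.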