Closed: lilailai688 closed this issue 6 months ago.
The parameters are right. Which dataset are you using?
BTW, a kind suggestion: it's better to provide detailed information when asking a question. The easiest way is to just paste your training log here. Right now I have no idea what the situation is or what problem you may have encountered.
PEMS08
Trainset: x-(10700, 12, 170, 3) y-(10700, 12, 170, 1)
Valset: x-(3567, 12, 170, 3) y-(3567, 12, 170, 1)
Testset: x-(3566, 12, 170, 3) y-(3566, 12, 170, 1)
--------- STAEformer ---------
{
"num_nodes": 170,
"in_steps": 12,
"out_steps": 12,
"train_size": 0.6,
"val_size": 0.2,
"time_of_day": true,
"day_of_week": true,
"lr": 0.001,
"weight_decay": 0.0015,
"milestones": [
25,
45,
65
],
"lr_decay_rate": 0.1,
"batch_size": 16,
"max_epochs": 300,
"early_stop": 30,
"use_cl": false,
"cl_step_size": 2500,
"model_args": {
"num_nodes": 170,
"in_steps": 12,
"out_steps": 12,
"steps_per_day": 288,
"input_dim": 3,
"output_dim": 1,
"input_embedding_dim": 24,
"tod_embedding_dim": 24,
"dow_embedding_dim": 24,
"spatial_embedding_dim": 0,
"adaptive_embedding_dim": 80,
"feed_forward_dim": 256,
"num_heads": 4,
"num_layers": 3,
"dropout": 0.1
}
}
==========================================================================================
Layer (type:depth-idx) Output Shape Param #
==========================================================================================
STAEformer [16, 12, 170, 1] 163,200
├─Linear: 1-1 [16, 12, 170, 24] 96
├─Embedding: 1-2 [16, 12, 170, 24] 6,912
├─Embedding: 1-3 [16, 12, 170, 24] 168
├─ModuleList: 1-4 -- --
│ └─SelfAttentionLayer: 2-1 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-1 [16, 170, 12, 152] 97,394
│ │ └─Dropout: 3-2 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-3 [16, 170, 12, 152] 304
│ │ └─Sequential: 3-4 [16, 170, 12, 152] 78,232
│ │ └─Dropout: 3-5 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-6 [16, 170, 12, 152] 304
│ └─SelfAttentionLayer: 2-2 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-7 [16, 170, 12, 152] 97,394
│ │ └─Dropout: 3-8 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-9 [16, 170, 12, 152] 304
│ │ └─Sequential: 3-10 [16, 170, 12, 152] 78,232
│ │ └─Dropout: 3-11 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-12 [16, 170, 12, 152] 304
│ └─SelfAttentionLayer: 2-3 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-13 [16, 170, 12, 152] 97,394
│ │ └─Dropout: 3-14 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-15 [16, 170, 12, 152] 304
│ │ └─Sequential: 3-16 [16, 170, 12, 152] 78,232
│ │ └─Dropout: 3-17 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-18 [16, 170, 12, 152] 304
├─ModuleList: 1-5 -- --
│ └─SelfAttentionLayer: 2-4 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-19 [16, 12, 170, 152] 97,394
│ │ └─Dropout: 3-20 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-21 [16, 12, 170, 152] 304
│ │ └─Sequential: 3-22 [16, 12, 170, 152] 78,232
│ │ └─Dropout: 3-23 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-24 [16, 12, 170, 152] 304
│ └─SelfAttentionLayer: 2-5 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-25 [16, 12, 170, 152] 97,394
│ │ └─Dropout: 3-26 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-27 [16, 12, 170, 152] 304
│ │ └─Sequential: 3-28 [16, 12, 170, 152] 78,232
│ │ └─Dropout: 3-29 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-30 [16, 12, 170, 152] 304
│ └─SelfAttentionLayer: 2-6 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-31 [16, 12, 170, 152] 97,394
│ │ └─Dropout: 3-32 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-33 [16, 12, 170, 152] 304
│ │ └─Sequential: 3-34 [16, 12, 170, 152] 78,232
│ │ └─Dropout: 3-35 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-36 [16, 12, 170, 152] 304
├─Linear: 1-6 [16, 170, 12] 21,900
==========================================================================================
Total params: 1,249,680
Trainable params: 1,249,680
Non-trainable params: 0
Total mult-adds (M): 16.96
==========================================================================================
Input size (MB): 0.39
Forward/backward pass size (MB): 2087.13
Params size (MB): 4.24
Estimated Total Size (MB): 2091.76
==========================================================================================
Loss: HuberLoss
2023-11-04 18:27:54.331792 Epoch 1 Train Loss = 109.37775 Val Loss = 120.59650
2023-11-04 18:28:49.342143 Epoch 2 Train Loss = 120.37145 Val Loss = 121.61814
2023-11-04 18:29:44.688708 Epoch 3 Train Loss = 120.15105 Val Loss = 121.71912
2023-11-04 18:30:40.250788 Epoch 4 Train Loss = 120.01890 Val Loss = 121.76399
2023-11-04 18:31:35.455355 Epoch 5 Train Loss = 120.14912 Val Loss = 121.60905
2023-11-04 18:32:31.003555 Epoch 6 Train Loss = 120.00636 Val Loss = 121.66571
2023-11-04 18:33:26.808337 Epoch 7 Train Loss = 120.03649 Val Loss = 121.67662
2023-11-04 18:34:22.111791 Epoch 8 Train Loss = 119.98375 Val Loss = 121.61085
2023-11-04 18:35:17.796124 Epoch 9 Train Loss = 119.98925 Val Loss = 121.80320
2023-11-04 18:36:12.571842 Epoch 10 Train Loss = 120.03527 Val Loss = 121.65998
2023-11-04 18:37:08.073542 Epoch 11 Train Loss = 119.91024 Val Loss = 121.66051
2023-11-04 18:38:03.403640 Epoch 12 Train Loss = 119.93934 Val Loss = 121.70652
2023-11-04 18:38:58.508955 Epoch 13 Train Loss = 119.96908 Val Loss = 121.68416
2023-11-04 18:39:53.776979 Epoch 14 Train Loss = 119.90834 Val Loss = 121.63253
2023-11-04 18:40:49.292259 Epoch 15 Train Loss = 119.89959 Val Loss = 121.66981
2023-11-04 18:41:44.540982 Epoch 16 Train Loss = 119.92112 Val Loss = 121.68546
2023-11-04 18:42:39.659373 Epoch 17 Train Loss = 119.95554 Val Loss = 121.66123
2023-11-04 18:43:35.208539 Epoch 18 Train Loss = 119.91257 Val Loss = 121.61867
2023-11-04 18:44:30.286884 Epoch 19 Train Loss = 120.01186 Val Loss = 121.65050
2023-11-04 18:45:25.566332 Epoch 20 Train Loss = 119.86794 Val Loss = 121.63741
2023-11-04 18:46:20.569113 Epoch 21 Train Loss = 120.10374 Val Loss = 121.92589
2023-11-04 18:47:16.241771 Epoch 22 Train Loss = 119.90238 Val Loss = 121.69902
2023-11-04 18:48:12.001550 Epoch 23 Train Loss = 119.88293 Val Loss = 121.66576
2023-11-04 18:49:07.386706 Epoch 24 Train Loss = 119.87113 Val Loss = 121.60922
2023-11-04 18:50:02.721957 Epoch 25 Train Loss = 119.89076 Val Loss = 121.67981
2023-11-04 18:50:57.857041 Epoch 26 Train Loss = 119.85204 Val Loss = 121.66775
2023-11-04 18:51:52.885819 Epoch 27 Train Loss = 119.84749 Val Loss = 121.66312
2023-11-04 18:52:48.080652 Epoch 28 Train Loss = 119.84764 Val Loss = 121.66338
2023-11-04 18:53:43.643754 Epoch 29 Train Loss = 119.84717 Val Loss = 121.66476
2023-11-04 18:54:38.648379 Epoch 30 Train Loss = 119.84842 Val Loss = 121.66462
2023-11-04 18:55:34.026405 Epoch 31 Train Loss = 119.84667 Val Loss = 121.66586
Early stopping at epoch: 31
Best at epoch 1:
Train Loss = 109.37775
Train RMSE = 146.22975, MAE = 120.02245, MAPE = 230.68023
Val Loss = 120.59650
Val RMSE = 148.71773, MAE = 121.81441, MAPE = 232.21247
Saved Model: ../saved_models/STAEformer-PEMS08-2023-11-04-18-26-54.pt
--------- Test ---------
All Steps RMSE = 147.13710, MAE = 120.61530, MAPE = 228.22506
Step 1 RMSE = 147.24420, MAE = 120.70740, MAPE = 227.69287
Step 2 RMSE = 147.22301, MAE = 120.68974, MAPE = 227.85320
Step 3 RMSE = 147.20938, MAE = 120.67343, MAPE = 227.90020
Step 4 RMSE = 147.19093, MAE = 120.65603, MAPE = 227.95112
Step 5 RMSE = 147.16719, MAE = 120.63870, MAPE = 228.04577
Step 6 RMSE = 147.14291, MAE = 120.62099, MAPE = 228.17717
Step 7 RMSE = 147.12712, MAE = 120.60538, MAPE = 228.27802
Step 8 RMSE = 147.11211, MAE = 120.58981, MAPE = 228.33850
Step 9 RMSE = 147.08528, MAE = 120.57293, MAPE = 228.51098
Step 10 RMSE = 147.07584, MAE = 120.55881, MAPE = 228.49448
Step 11 RMSE = 147.04865, MAE = 120.54302, MAPE = 228.64282
Step 12 RMSE = 147.01773, MAE = 120.52732, MAPE = 228.81563
Inference time: 5.67 s
Whether or not it's an LR bug, I tried 0.0001 and it works better.
No, I don't think so. The model is just not training at all; it has nothing to do with the LR.
I wonder if you modified the model code (STAEformer.py), because the number of parameters has changed.
Here:
==========================================================================================
Layer (type:depth-idx) Output Shape Param #
==========================================================================================
STAEformer [16, 12, 170, 1] 163,200
├─Linear: 1-1 [16, 12, 170, 24] 96
├─Embedding: 1-2 [16, 12, 170, 24] 6,912
├─Embedding: 1-3 [16, 12, 170, 24] 168
├─ModuleList: 1-4 -- --
│ └─SelfAttentionLayer: 2-1 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-1 [16, 170, 12, 152] 93,024
│ │ └─Dropout: 3-2 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-3 [16, 170, 12, 152] 304
│ │ └─Sequential: 3-4 [16, 170, 12, 152] 78,232
│ │ └─Dropout: 3-5 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-6 [16, 170, 12, 152] 304
│ └─SelfAttentionLayer: 2-2 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-7 [16, 170, 12, 152] 93,024
│ │ └─Dropout: 3-8 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-9 [16, 170, 12, 152] 304
│ │ └─Sequential: 3-10 [16, 170, 12, 152] 78,232
│ │ └─Dropout: 3-11 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-12 [16, 170, 12, 152] 304
│ └─SelfAttentionLayer: 2-3 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-13 [16, 170, 12, 152] 93,024
│ │ └─Dropout: 3-14 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-15 [16, 170, 12, 152] 304
│ │ └─Sequential: 3-16 [16, 170, 12, 152] 78,232
│ │ └─Dropout: 3-17 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-18 [16, 170, 12, 152] 304
├─ModuleList: 1-5 -- --
│ └─SelfAttentionLayer: 2-4 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-19 [16, 12, 170, 152] 93,024
│ │ └─Dropout: 3-20 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-21 [16, 12, 170, 152] 304
│ │ └─Sequential: 3-22 [16, 12, 170, 152] 78,232
│ │ └─Dropout: 3-23 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-24 [16, 12, 170, 152] 304
│ └─SelfAttentionLayer: 2-5 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-25 [16, 12, 170, 152] 93,024
│ │ └─Dropout: 3-26 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-27 [16, 12, 170, 152] 304
│ │ └─Sequential: 3-28 [16, 12, 170, 152] 78,232
│ │ └─Dropout: 3-29 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-30 [16, 12, 170, 152] 304
│ └─SelfAttentionLayer: 2-6 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-31 [16, 12, 170, 152] 93,024
│ │ └─Dropout: 3-32 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-33 [16, 12, 170, 152] 304
│ │ └─Sequential: 3-34 [16, 12, 170, 152] 78,232
│ │ └─Dropout: 3-35 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-36 [16, 12, 170, 152] 304
├─Linear: 1-6 [16, 170, 12] 21,900
==========================================================================================
Total params: 1,223,460
Trainable params: 1,223,460
Non-trainable params: 0
Total mult-adds (M): 16.96
==========================================================================================
Input size (MB): 0.39
Forward/backward pass size (MB): 2087.13
Params size (MB): 4.24
Estimated Total Size (MB): 2091.76
==========================================================================================
Total params: 1,223,460. But you got 1,249,680.
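For reference, here is a rough check of where the gap could come from. This is only a back-of-the-envelope sketch and assumes the stock AttentionLayer is nothing more than four Linear(model_dim, model_dim) projections (Q, K, V, output); I haven't seen your modified code, so treat it as an illustration rather than a diagnosis:

```python
# Hypothetical sanity check of the AttentionLayer parameter count
# (assumes four Linear(model_dim, model_dim) projections: Q, K, V, output).
input_dim, tod_dim, dow_dim, spatial_dim, adaptive_dim = 24, 24, 24, 0, 80
model_dim = input_dim + tod_dim + dow_dim + spatial_dim + adaptive_dim  # 152

attn_params = 4 * (model_dim * model_dim + model_dim)  # weights + biases
print(attn_params)             # 93024 -> matches the summary above

per_layer_gap = 97394 - attn_params  # 4370 extra parameters per layer in your run
print(per_layer_gap * 6)       # 26220 == 1249680 - 1223460, the whole difference
```

The entire 26,220-parameter difference sits inside the six attention layers, which is why I suspect STAEformer.py was changed.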
Yes, you are right.
OK. Feel free to ask if you have other questions :)
PEMS08
Trainset: x-(10700, 12, 170, 3) y-(10700, 12, 170, 1)
Valset: x-(3567, 12, 170, 3) y-(3567, 12, 170, 1)
Testset: x-(3566, 12, 170, 3) y-(3566, 12, 170, 1)
--------- STAEformer ---------
{
"num_nodes": 170,
"in_steps": 12,
"out_steps": 12,
"train_size": 0.6,
"val_size": 0.2,
"time_of_day": true,
"day_of_week": true,
"lr": 0.001,
"weight_decay": 0.0015,
"milestones": [
25,
45,
65
],
"lr_decay_rate": 0.1,
"batch_size": 16,
"max_epochs": 300,
"early_stop": 30,
"use_cl": false,
"cl_step_size": 2500,
"model_args": {
"num_nodes": 170,
"in_steps": 12,
"out_steps": 12,
"steps_per_day": 288,
"input_dim": 3,
"output_dim": 1,
"input_embedding_dim": 24,
"tod_embedding_dim": 24,
"dow_embedding_dim": 24,
"spatial_embedding_dim": 0,
"adaptive_embedding_dim": 80,
"feed_forward_dim": 256,
"num_heads": 4,
"num_layers": 3,
"dropout": 0.1
}
}
==========================================================================================
Layer (type:depth-idx) Output Shape Param #
==========================================================================================
STAEformer [16, 12, 170, 1] 163,200
├─Linear: 1-1 [16, 12, 170, 24] 96
├─Embedding: 1-2 [16, 12, 170, 24] 6,912
├─Embedding: 1-3 [16, 12, 170, 24] 168
├─ModuleList: 1-4 -- --
│ └─SelfAttentionLayer: 2-1 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-1 [16, 170, 12, 152] 93,024
│ │ └─Dropout: 3-2 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-3 [16, 170, 12, 152] 304
│ │ └─Sequential: 3-4 [16, 170, 12, 152] 78,232
│ │ └─Dropout: 3-5 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-6 [16, 170, 12, 152] 304
│ └─SelfAttentionLayer: 2-2 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-7 [16, 170, 12, 152] 93,024
│ │ └─Dropout: 3-8 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-9 [16, 170, 12, 152] 304
│ │ └─Sequential: 3-10 [16, 170, 12, 152] 78,232
│ │ └─Dropout: 3-11 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-12 [16, 170, 12, 152] 304
│ └─SelfAttentionLayer: 2-3 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-13 [16, 170, 12, 152] 93,024
│ │ └─Dropout: 3-14 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-15 [16, 170, 12, 152] 304
│ │ └─Sequential: 3-16 [16, 170, 12, 152] 78,232
│ │ └─Dropout: 3-17 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-18 [16, 170, 12, 152] 304
├─ModuleList: 1-5 -- --
│ └─SelfAttentionLayer: 2-4 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-19 [16, 12, 170, 152] 93,024
│ │ └─Dropout: 3-20 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-21 [16, 12, 170, 152] 304
│ │ └─Sequential: 3-22 [16, 12, 170, 152] 78,232
│ │ └─Dropout: 3-23 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-24 [16, 12, 170, 152] 304
│ └─SelfAttentionLayer: 2-5 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-25 [16, 12, 170, 152] 93,024
│ │ └─Dropout: 3-26 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-27 [16, 12, 170, 152] 304
│ │ └─Sequential: 3-28 [16, 12, 170, 152] 78,232
│ │ └─Dropout: 3-29 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-30 [16, 12, 170, 152] 304
│ └─SelfAttentionLayer: 2-6 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-31 [16, 12, 170, 152] 93,024
│ │ └─Dropout: 3-32 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-33 [16, 12, 170, 152] 304
│ │ └─Sequential: 3-34 [16, 12, 170, 152] 78,232
│ │ └─Dropout: 3-35 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-36 [16, 12, 170, 152] 304
├─Linear: 1-6 [16, 170, 12] 21,900
==========================================================================================
Total params: 1,223,460
Trainable params: 1,223,460
Non-trainable params: 0
Total mult-adds (M): 16.96
==========================================================================================
Input size (MB): 0.39
Forward/backward pass size (MB): 2087.13
Params size (MB): 4.24
Estimated Total Size (MB): 2091.76
==========================================================================================
Loss: HuberLoss
2023-11-04 22:22:56.989962 Epoch 1 Train Loss = 26.56447 Val Loss = 22.67918
2023-11-04 22:23:46.090357 Epoch 2 Train Loss = 18.76599 Val Loss = 19.49613
2023-11-04 22:24:35.602263 Epoch 3 Train Loss = 17.90739 Val Loss = 16.70756
2023-11-04 22:25:25.215039 Epoch 4 Train Loss = 16.63579 Val Loss = 16.69739
2023-11-04 22:26:15.339557 Epoch 5 Train Loss = 16.37046 Val Loss = 16.10194
2023-11-04 22:27:05.180984 Epoch 6 Train Loss = 15.74722 Val Loss = 15.37927
2023-11-04 22:27:54.527145 Epoch 7 Train Loss = 15.32337 Val Loss = 15.01879
2023-11-04 22:28:44.174618 Epoch 8 Train Loss = 15.10617 Val Loss = 14.77142
2023-11-04 22:29:34.048411 Epoch 9 Train Loss = 14.86726 Val Loss = 14.74364
2023-11-04 22:30:23.114273 Epoch 10 Train Loss = 14.72955 Val Loss = 14.63478
2023-11-04 22:31:13.083188 Epoch 11 Train Loss = 14.49073 Val Loss = 14.78110
2023-11-04 22:32:03.293874 Epoch 12 Train Loss = 14.39638 Val Loss = 14.81869
2023-11-04 22:32:52.861439 Epoch 13 Train Loss = 14.30188 Val Loss = 15.40232
2023-11-04 22:33:42.268881 Epoch 14 Train Loss = 14.15652 Val Loss = 14.05722
2023-11-04 22:34:31.912538 Epoch 15 Train Loss = 13.98045 Val Loss = 14.04077
2023-11-04 22:35:21.555447 Epoch 16 Train Loss = 13.90123 Val Loss = 14.18517
2023-11-04 22:36:11.153330 Epoch 17 Train Loss = 13.83669 Val Loss = 14.59372
2023-11-04 22:37:00.967877 Epoch 18 Train Loss = 13.78007 Val Loss = 13.89624
2023-11-04 22:37:50.367475 Epoch 19 Train Loss = 13.69647 Val Loss = 13.87162
2023-11-04 22:38:39.797129 Epoch 20 Train Loss = 13.51429 Val Loss = 13.94320
2023-11-04 22:39:29.266244 Epoch 21 Train Loss = 13.54230 Val Loss = 13.97549
2023-11-04 22:40:19.109974 Epoch 22 Train Loss = 13.44518 Val Loss = 13.68561
2023-11-04 22:41:08.645677 Epoch 23 Train Loss = 13.35127 Val Loss = 14.25897
2023-11-04 22:41:58.257568 Epoch 24 Train Loss = 13.31230 Val Loss = 13.89145
2023-11-04 22:42:47.686874 Epoch 25 Train Loss = 13.28758 Val Loss = 13.70837
2023-11-04 22:43:37.343137 Epoch 26 Train Loss = 12.62391 Val Loss = 13.22810
2023-11-04 22:44:27.142237 Epoch 27 Train Loss = 12.54381 Val Loss = 13.17619
2023-11-04 22:45:16.934709 Epoch 28 Train Loss = 12.51562 Val Loss = 13.23723
2023-11-04 22:46:06.165762 Epoch 29 Train Loss = 12.49374 Val Loss = 13.23450
2023-11-04 22:46:55.880830 Epoch 30 Train Loss = 12.47135 Val Loss = 13.16035
2023-11-04 22:47:45.629951 Epoch 31 Train Loss = 12.45716 Val Loss = 13.22258
2023-11-04 22:48:35.261119 Epoch 32 Train Loss = 12.43723 Val Loss = 13.18819
2023-11-04 22:49:25.274371 Epoch 33 Train Loss = 12.42302 Val Loss = 13.14515
2023-11-04 22:50:15.011436 Epoch 34 Train Loss = 12.41303 Val Loss = 13.14385
2023-11-04 22:51:04.754964 Epoch 35 Train Loss = 12.39326 Val Loss = 13.18016
2023-11-04 22:51:54.523198 Epoch 36 Train Loss = 12.38184 Val Loss = 13.15507
2023-11-04 22:52:44.106783 Epoch 37 Train Loss = 12.36790 Val Loss = 13.16268
2023-11-04 22:53:34.006385 Epoch 38 Train Loss = 12.35726 Val Loss = 13.16237
2023-11-04 22:54:24.051829 Epoch 39 Train Loss = 12.34837 Val Loss = 13.15964
2023-11-04 22:55:13.833324 Epoch 40 Train Loss = 12.33750 Val Loss = 13.11827
2023-11-04 22:56:03.902229 Epoch 41 Train Loss = 12.32789 Val Loss = 13.15559
2023-11-04 22:56:53.767915 Epoch 42 Train Loss = 12.31515 Val Loss = 13.19749
2023-11-04 22:57:44.008861 Epoch 43 Train Loss = 12.30980 Val Loss = 13.20676
2023-11-04 22:58:34.280960 Epoch 44 Train Loss = 12.29574 Val Loss = 13.19759
2023-11-04 22:59:24.233820 Epoch 45 Train Loss = 12.28626 Val Loss = 13.13427
2023-11-04 23:00:14.103778 Epoch 46 Train Loss = 12.21365 Val Loss = 13.10037
2023-11-04 23:01:04.039677 Epoch 47 Train Loss = 12.20668 Val Loss = 13.10854
2023-11-04 23:01:53.736883 Epoch 48 Train Loss = 12.20404 Val Loss = 13.10801
2023-11-04 23:02:43.489697 Epoch 49 Train Loss = 12.20174 Val Loss = 13.09674
2023-11-04 23:03:33.770341 Epoch 50 Train Loss = 12.19923 Val Loss = 13.11514
2023-11-04 23:04:23.625164 Epoch 51 Train Loss = 12.19944 Val Loss = 13.10070
2023-11-04 23:05:13.397919 Epoch 52 Train Loss = 12.19409 Val Loss = 13.09716
2023-11-04 23:06:03.153942 Epoch 53 Train Loss = 12.19393 Val Loss = 13.10942
2023-11-04 23:06:53.262505 Epoch 54 Train Loss = 12.19492 Val Loss = 13.11126
2023-11-04 23:07:42.998320 Epoch 55 Train Loss = 12.19056 Val Loss = 13.11594
2023-11-04 23:08:32.977478 Epoch 56 Train Loss = 12.19043 Val Loss = 13.11762
2023-11-04 23:09:22.608203 Epoch 57 Train Loss = 12.18761 Val Loss = 13.10149
2023-11-04 23:10:12.500485 Epoch 58 Train Loss = 12.18659 Val Loss = 13.11415
2023-11-04 23:11:02.346330 Epoch 59 Train Loss = 12.18654 Val Loss = 13.10835
2023-11-04 23:11:52.295897 Epoch 60 Train Loss = 12.18358 Val Loss = 13.10927
2023-11-04 23:12:42.079800 Epoch 61 Train Loss = 12.18059 Val Loss = 13.10720
2023-11-04 23:13:31.988412 Epoch 62 Train Loss = 12.18122 Val Loss = 13.09948
2023-11-04 23:14:21.942601 Epoch 63 Train Loss = 12.17826 Val Loss = 13.10516
2023-11-04 23:15:11.707376 Epoch 64 Train Loss = 12.17811 Val Loss = 13.11541
2023-11-04 23:16:01.396825 Epoch 65 Train Loss = 12.17559 Val Loss = 13.10633
2023-11-04 23:16:51.646142 Epoch 66 Train Loss = 12.16801 Val Loss = 13.10201
2023-11-04 23:17:41.273800 Epoch 67 Train Loss = 12.16677 Val Loss = 13.10674
2023-11-04 23:18:30.689707 Epoch 68 Train Loss = 12.16518 Val Loss = 13.10371
2023-11-04 23:19:20.528552 Epoch 69 Train Loss = 12.16560 Val Loss = 13.10503
2023-11-04 23:20:10.292778 Epoch 70 Train Loss = 12.16746 Val Loss = 13.10407
2023-11-04 23:21:00.422633 Epoch 71 Train Loss = 12.16584 Val Loss = 13.10132
2023-11-04 23:21:50.112689 Epoch 72 Train Loss = 12.16463 Val Loss = 13.10365
2023-11-04 23:22:39.796703 Epoch 73 Train Loss = 12.16579 Val Loss = 13.10187
2023-11-04 23:23:29.577102 Epoch 74 Train Loss = 12.16471 Val Loss = 13.10359
2023-11-04 23:24:19.745788 Epoch 75 Train Loss = 12.16498 Val Loss = 13.10360
2023-11-04 23:25:09.664569 Epoch 76 Train Loss = 12.16604 Val Loss = 13.10489
2023-11-04 23:25:59.906561 Epoch 77 Train Loss = 12.16392 Val Loss = 13.10149
2023-11-04 23:26:49.808301 Epoch 78 Train Loss = 12.16424 Val Loss = 13.10363
2023-11-04 23:27:39.593638 Epoch 79 Train Loss = 12.16330 Val Loss = 13.10135
Early stopping at epoch: 79
Best at epoch 49:
Train Loss = 12.20174
Train RMSE = 22.04260, MAE = 12.42602, MAPE = 8.16751
Val Loss = 13.09674
Val RMSE = 24.16307, MAE = 13.52780, MAPE = 10.06378
Saved Model: ../saved_models/STAEformer-PEMS08-2023-11-04-22-22-03.pt
--------- Test ---------
All Steps RMSE = 23.44588, MAE = 13.50675, MAPE = 8.85165
Step 1 RMSE = 19.59471, MAE = 11.77354, MAPE = 7.76780
Step 2 RMSE = 20.74049, MAE = 12.23849, MAPE = 8.03951
Step 3 RMSE = 21.64034, MAE = 12.63595, MAPE = 8.27516
Step 4 RMSE = 22.37883, MAE = 12.95883, MAPE = 8.47686
Step 5 RMSE = 22.96374, MAE = 13.23836, MAPE = 8.65528
Step 6 RMSE = 23.51227, MAE = 13.50210, MAPE = 8.82782
Step 7 RMSE = 23.97938, MAE = 13.74513, MAPE = 8.99198
Step 8 RMSE = 24.40706, MAE = 13.97189, MAPE = 9.14366
Step 9 RMSE = 24.77964, MAE = 14.18248, MAPE = 9.28395
Step 10 RMSE = 25.12773, MAE = 14.38134, MAPE = 9.42707
Step 11 RMSE = 25.45713, MAE = 14.58785, MAPE = 9.57068
Step 12 RMSE = 25.86324, MAE = 14.86522, MAPE = 9.76002
Inference time: 3.93 s
The MAE for PEMS08 in the paper is 13.46. Is that the average over the 12 steps, or the value at the 12th step?
... We have clarified this in our paper. Please refer to Section 4.1, at the bottom right corner of page 3:
Following previous work, we select the average performance of all predicted 12 horizons on the PEMS04, PEMS07 and PEMS08 datasets.
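So the 13.46 is the average of the 12 per-horizon MAEs, i.e. the "All Steps" line in the test output, not the Step 12 value. A minimal sketch of that aggregation (the array names and shapes here are hypothetical, just to illustrate the averaging):

```python
import numpy as np

# y_true, y_pred: hypothetical arrays of shape (num_samples, out_steps, num_nodes)
def step_mae(y_true, y_pred, step):
    # Corresponds to the "Step k ... MAE" lines in the test output.
    return np.abs(y_true[:, step] - y_pred[:, step]).mean()

def all_steps_mae(y_true, y_pred):
    # The reported metric: the average over all 12 horizons ("All Steps MAE").
    return np.mean([step_mae(y_true, y_pred, s) for s in range(y_true.shape[1])])
```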
Is there a reason why it takes me so long to run?
D:\Anacanda\Anaconda3\python.exe C:\Users\stop\Desktop\STAEformer-main\model\train.py
PEMS08
Trainset: x-(10700, 12, 170, 3) y-(10700, 12, 170, 1)
Valset: x-(3567, 12, 170, 3) y-(3567, 12, 170, 1)
Testset: x-(3566, 12, 170, 3) y-(3566, 12, 170, 1)
2023-11-07 19:42:24.553347: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE SSE2 SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
--------- STAEformer ---------
{
"num_nodes": 170,
"in_steps": 12,
"out_steps": 12,
"train_size": 0.6,
"val_size": 0.2,
"time_of_day": true,
"day_of_week": true,
"lr": 0.001,
"weight_decay": 0.0015,
"milestones": [
25,
45,
65
],
"lr_decay_rate": 0.1,
"batch_size": 16,
"max_epochs": 300,
"early_stop": 30,
"use_cl": false,
"cl_step_size": 2500,
"model_args": {
"num_nodes": 170,
"in_steps": 12,
"out_steps": 12,
"steps_per_day": 288,
"input_dim": 3,
"output_dim": 1,
"input_embedding_dim": 24,
"tod_embedding_dim": 24,
"dow_embedding_dim": 24,
"spatial_embedding_dim": 0,
"adaptive_embedding_dim": 80,
"feed_forward_dim": 256,
"num_heads": 4,
"num_layers": 3,
"dropout": 0.1
}
}
==========================================================================================
Layer (type:depth-idx) Output Shape Param #
==========================================================================================
STAEformer [16, 12, 170, 1] 163,200
├─Linear: 1-1 [16, 12, 170, 24] 96
├─Embedding: 1-2 [16, 12, 170, 24] 6,912
├─Embedding: 1-3 [16, 12, 170, 24] 168
├─ModuleList: 1-4 -- --
│ └─SelfAttentionLayer: 2-1 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-1 [16, 170, 12, 152] 93,024
│ │ └─Dropout: 3-2 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-3 [16, 170, 12, 152] 304
│ │ └─Sequential: 3-4 [16, 170, 12, 152] 78,232
│ │ └─Dropout: 3-5 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-6 [16, 170, 12, 152] 304
│ └─SelfAttentionLayer: 2-2 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-7 [16, 170, 12, 152] 93,024
│ │ └─Dropout: 3-8 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-9 [16, 170, 12, 152] 304
│ │ └─Sequential: 3-10 [16, 170, 12, 152] 78,232
│ │ └─Dropout: 3-11 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-12 [16, 170, 12, 152] 304
│ └─SelfAttentionLayer: 2-3 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-13 [16, 170, 12, 152] 93,024
│ │ └─Dropout: 3-14 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-15 [16, 170, 12, 152] 304
│ │ └─Sequential: 3-16 [16, 170, 12, 152] 78,232
│ │ └─Dropout: 3-17 [16, 170, 12, 152] --
│ │ └─LayerNorm: 3-18 [16, 170, 12, 152] 304
├─ModuleList: 1-5 -- --
│ └─SelfAttentionLayer: 2-4 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-19 [16, 12, 170, 152] 93,024
│ │ └─Dropout: 3-20 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-21 [16, 12, 170, 152] 304
│ │ └─Sequential: 3-22 [16, 12, 170, 152] 78,232
│ │ └─Dropout: 3-23 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-24 [16, 12, 170, 152] 304
│ └─SelfAttentionLayer: 2-5 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-25 [16, 12, 170, 152] 93,024
│ │ └─Dropout: 3-26 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-27 [16, 12, 170, 152] 304
│ │ └─Sequential: 3-28 [16, 12, 170, 152] 78,232
│ │ └─Dropout: 3-29 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-30 [16, 12, 170, 152] 304
│ └─SelfAttentionLayer: 2-6 [16, 12, 170, 152] --
│ │ └─AttentionLayer: 3-31 [16, 12, 170, 152] 93,024
│ │ └─Dropout: 3-32 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-33 [16, 12, 170, 152] 304
│ │ └─Sequential: 3-34 [16, 12, 170, 152] 78,232
│ │ └─Dropout: 3-35 [16, 12, 170, 152] --
│ │ └─LayerNorm: 3-36 [16, 12, 170, 152] 304
├─Linear: 1-6 [16, 170, 12] 21,900
==========================================================================================
Total params: 1,223,460
Trainable params: 1,223,460
Non-trainable params: 0
Total mult-adds (Units.MEGABYTES): 16.96
==========================================================================================
Input size (MB): 0.39
Forward/backward pass size (MB): 2087.13
Params size (MB): 4.24
Estimated Total Size (MB): 2091.76
==========================================================================================
Loss: HuberLoss
2023-11-07 19:44:50.934250 Epoch 1 Train Loss = 26.94686 Val Loss = 20.15100
2023-11-07 19:47:13.924053 Epoch 2 Train Loss = 19.00720 Val Loss = 18.99558
2023-11-07 19:49:37.256174 Epoch 3 Train Loss = 17.57253 Val Loss = 17.37119
2023-11-07 19:52:00.527695 Epoch 4 Train Loss = 16.87148 Val Loss = 15.80751
2023-11-07 19:54:23.728859 Epoch 5 Train Loss = 16.22930 Val Loss = 16.55234
2023-11-07 19:56:46.973601 Epoch 6 Train Loss = 16.10923 Val Loss = 16.96667
2023-11-07 19:59:10.182352 Epoch 7 Train Loss = 15.55004 Val Loss = 15.14247
2023-11-07 20:01:33.363080 Epoch 8 Train Loss = 15.32233 Val Loss = 15.27317
2023-11-07 20:03:56.583921 Epoch 9 Train Loss = 15.01539 Val Loss = 15.33415
2023-11-07 20:06:19.896107 Epoch 10 Train Loss = 14.94130 Val Loss = 15.91972
2023-11-07 20:08:43.068435 Epoch 11 Train Loss = 14.73261 Val Loss = 14.83506
2023-11-07 20:11:06.264286 Epoch 12 Train Loss = 14.55152 Val Loss = 14.67267
2023-11-07 20:13:29.412744 Epoch 13 Train Loss = 14.36773 Val Loss = 14.50170
2023-11-07 20:15:52.541736 Epoch 14 Train Loss = 14.26014 Val Loss = 14.22951
2023-11-07 20:18:15.699267 Epoch 15 Train Loss = 14.03027 Val Loss = 14.16464
2023-11-07 20:20:38.863330 Epoch 16 Train Loss = 13.97134 Val Loss = 14.37905
2023-11-07 20:23:02.057725 Epoch 17 Train Loss = 13.87778 Val Loss = 13.99436
2023-11-07 20:25:25.230582 Epoch 18 Train Loss = 13.74735 Val Loss = 14.07935
2023-11-07 20:27:48.422176 Epoch 19 Train Loss = 13.71970 Val Loss = 13.90268
2023-11-07 20:30:11.801672 Epoch 20 Train Loss = 13.55550 Val Loss = 14.00144
2023-11-07 20:32:35.052187 Epoch 21 Train Loss = 13.48250 Val Loss = 13.83573
2023-11-07 20:34:58.199632 Epoch 22 Train Loss = 13.48992 Val Loss = 13.75479
2023-11-07 20:37:21.373810 Epoch 23 Train Loss = 13.44191 Val Loss = 13.80880
2023-11-07 20:39:44.551084 Epoch 24 Train Loss = 13.29604 Val Loss = 13.75014
2023-11-07 20:42:07.800724 Epoch 25 Train Loss = 13.27085 Val Loss = 13.63319
2023-11-07 20:44:31.154076 Epoch 26 Train Loss = 12.65985 Val Loss = 13.13464
2023-11-07 20:46:54.475325 Epoch 27 Train Loss = 12.57553 Val Loss = 13.13647
2023-11-07 20:49:18.016899 Epoch 28 Train Loss = 12.54122 Val Loss = 13.19084
2023-11-07 20:51:43.751686 Epoch 29 Train Loss = 12.52014 Val Loss = 13.13270
2023-11-07 20:54:07.310270 Epoch 30 Train Loss = 12.50338 Val Loss = 13.11538
2023-11-07 20:56:30.972260 Epoch 31 Train Loss = 12.48552 Val Loss = 13.14225
2023-11-07 20:58:54.438475 Epoch 32 Train Loss = 12.46436 Val Loss = 13.11734
2023-11-07 21:01:17.816878 Epoch 33 Train Loss = 12.45413 Val Loss = 13.09134
2023-11-07 21:03:41.374631 Epoch 34 Train Loss = 12.43830 Val Loss = 13.10893
2023-11-07 21:06:05.032478 Epoch 35 Train Loss = 12.42598 Val Loss = 13.09105
2023-11-07 21:08:29.058138 Epoch 36 Train Loss = 12.41539 Val Loss = 13.09766
2023-11-07 21:10:52.994787 Epoch 37 Train Loss = 12.40023 Val Loss = 13.11258
2023-11-07 21:13:16.542625 Epoch 38 Train Loss = 12.39109 Val Loss = 13.07675
2023-11-07 21:15:40.031981 Epoch 39 Train Loss = 12.37925 Val Loss = 13.14383
2023-11-07 21:18:03.289777 Epoch 40 Train Loss = 12.37153 Val Loss = 13.06921
2023-11-07 21:20:26.859738 Epoch 41 Train Loss = 12.36109 Val Loss = 13.07015
2023-11-07 21:22:50.367579 Epoch 42 Train Loss = 12.34959 Val Loss = 13.14505
2023-11-07 21:25:13.944592 Epoch 43 Train Loss = 12.34492 Val Loss = 13.11190
2023-11-07 21:27:37.393295 Epoch 44 Train Loss = 12.33807 Val Loss = 13.07850
2023-11-07 21:30:00.805995 Epoch 45 Train Loss = 12.32557 Val Loss = 13.05335
2023-11-07 21:32:24.745653 Epoch 46 Train Loss = 12.25445 Val Loss = 13.04973
2023-11-07 21:34:48.942693 Epoch 47 Train Loss = 12.24596 Val Loss = 13.04086
2023-11-07 21:37:12.654701 Epoch 48 Train Loss = 12.24489 Val Loss = 13.03962
2023-11-07 21:39:36.023095 Epoch 49 Train Loss = 12.24442 Val Loss = 13.04484
2023-11-07 21:41:59.303410 Epoch 50 Train Loss = 12.24153 Val Loss = 13.03129
2023-11-07 21:44:22.648819 Epoch 51 Train Loss = 12.24090 Val Loss = 13.04687
2023-11-07 21:46:46.010189 Epoch 52 Train Loss = 12.23667 Val Loss = 13.04136
2023-11-07 21:49:09.471434 Epoch 53 Train Loss = 12.23634 Val Loss = 13.04207
2023-11-07 21:51:32.940506 Epoch 54 Train Loss = 12.23532 Val Loss = 13.05034
2023-11-07 21:53:56.417368 Epoch 55 Train Loss = 12.23171 Val Loss = 13.04834
2023-11-07 21:56:19.849841 Epoch 56 Train Loss = 12.22941 Val Loss = 13.05217
2023-11-07 21:58:43.337402 Epoch 57 Train Loss = 12.22974 Val Loss = 13.05176
2023-11-07 22:01:06.786807 Epoch 58 Train Loss = 12.22769 Val Loss = 13.04011
2023-11-07 22:03:30.244872 Epoch 59 Train Loss = 12.22611 Val Loss = 13.03589
2023-11-07 22:05:53.747256 Epoch 60 Train Loss = 12.22333 Val Loss = 13.04491
2023-11-07 22:08:17.182095 Epoch 61 Train Loss = 12.22326 Val Loss = 13.04879
2023-11-07 22:10:40.639031 Epoch 62 Train Loss = 12.22190 Val Loss = 13.02249
2023-11-07 22:13:04.152788 Epoch 63 Train Loss = 12.22080 Val Loss = 13.04877
2023-11-07 22:15:27.768911 Epoch 64 Train Loss = 12.21729 Val Loss = 13.04065
2023-11-07 22:17:51.241379 Epoch 65 Train Loss = 12.21695 Val Loss = 13.05228
2023-11-07 22:20:14.472846 Epoch 66 Train Loss = 12.21098 Val Loss = 13.03730
2023-11-07 22:22:37.934668 Epoch 67 Train Loss = 12.20878 Val Loss = 13.03684
2023-11-07 22:25:01.501469 Epoch 68 Train Loss = 12.20948 Val Loss = 13.03636
2023-11-07 22:27:25.056968 Epoch 69 Train Loss = 12.20989 Val Loss = 13.03668
2023-11-07 22:29:48.595842 Epoch 70 Train Loss = 12.20987 Val Loss = 13.04128
2023-11-07 22:32:12.219420 Epoch 71 Train Loss = 12.20807 Val Loss = 13.03974
2023-11-07 22:34:35.763714 Epoch 72 Train Loss = 12.20911 Val Loss = 13.03731
2023-11-07 22:36:59.416017 Epoch 73 Train Loss = 12.20716 Val Loss = 13.03944
2023-11-07 22:39:22.828690 Epoch 74 Train Loss = 12.20804 Val Loss = 13.03797
2023-11-07 22:41:46.414267 Epoch 75 Train Loss = 12.20666 Val Loss = 13.03749
2023-11-07 22:44:10.052228 Epoch 76 Train Loss = 12.20806 Val Loss = 13.03860
2023-11-07 22:46:33.754697 Epoch 77 Train Loss = 12.20615 Val Loss = 13.03749
2023-11-07 22:48:57.326507 Epoch 78 Train Loss = 12.20539 Val Loss = 13.03995
2023-11-07 22:51:20.744390 Epoch 79 Train Loss = 12.20705 Val Loss = 13.03697
2023-11-07 22:53:44.408878 Epoch 80 Train Loss = 12.20562 Val Loss = 13.03753
2023-11-07 22:56:07.975939 Epoch 81 Train Loss = 12.20694 Val Loss = 13.03644
2023-11-07 22:58:31.605106 Epoch 82 Train Loss = 12.20723 Val Loss = 13.03995
2023-11-07 23:00:55.152003 Epoch 83 Train Loss = 12.20596 Val Loss = 13.03932
2023-11-07 23:03:18.782330 Epoch 84 Train Loss = 12.20619 Val Loss = 13.04088
2023-11-07 23:05:42.323946 Epoch 85 Train Loss = 12.20489 Val Loss = 13.04158
2023-11-07 23:08:05.774943 Epoch 86 Train Loss = 12.20561 Val Loss = 13.03788
2023-11-07 23:10:29.427994 Epoch 87 Train Loss = 12.20638 Val Loss = 13.03655
2023-11-07 23:12:52.854312 Epoch 88 Train Loss = 12.20603 Val Loss = 13.04027
2023-11-07 23:15:16.367218 Epoch 89 Train Loss = 12.20474 Val Loss = 13.03880
2023-11-07 23:17:40.220728 Epoch 90 Train Loss = 12.20594 Val Loss = 13.03945
2023-11-07 23:20:03.733834 Epoch 91 Train Loss = 12.20375 Val Loss = 13.03750
2023-11-07 23:22:27.193084 Epoch 92 Train Loss = 12.20630 Val Loss = 13.03977
Early stopping at epoch: 92
Best at epoch 62:
Train Loss = 12.22190
Train RMSE = 22.11840, MAE = 12.47274, MAPE = 8.18564
Val Loss = 13.02249
Val RMSE = 24.07167, MAE = 13.46965, MAPE = 9.94274
Saved Model: ../saved_models/STAEformer-PEMS08-2023-11-07-19-42-18.pt
--------- Test ---------
All Steps RMSE = 23.27587, MAE = 13.41182, MAPE = 8.78660
Step 1 RMSE = 19.59908, MAE = 11.75883, MAPE = 7.74684
Step 2 RMSE = 20.67365, MAE = 12.20465, MAPE = 8.00521
Step 3 RMSE = 21.52860, MAE = 12.58538, MAPE = 8.22849
Step 4 RMSE = 22.23368, MAE = 12.89660, MAPE = 8.41676
Step 5 RMSE = 22.80104, MAE = 13.16269, MAPE = 8.58858
Step 6 RMSE = 23.30442, MAE = 13.40317, MAPE = 8.75613
Step 7 RMSE = 23.76362, MAE = 13.63729, MAPE = 8.90916
Step 8 RMSE = 24.17569, MAE = 13.84729, MAPE = 9.05252
Step 9 RMSE = 24.54942, MAE = 14.04440, MAPE = 9.20109
Step 10 RMSE = 24.90721, MAE = 14.23331, MAPE = 9.34760
Step 11 RMSE = 25.26548, MAE = 14.44469, MAPE = 9.49563
Step 12 RMSE = 25.65762, MAE = 14.72342, MAPE = 9.69132
Inference time: 13.87 s
Process finished with exit code 0
I am sorry that I cannot give you an exact reason. Your training process doesn't seem to have any issues; the only drawback is that it runs slowly. There could be many potential reasons for a slow running speed, such as GPU performance, CPU performance, CPU/GPU utilization, or other programs running concurrently and interfering with the training. Since I don't know anything about your machine, I can't say definitively what the cause is in your specific case.
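One quick thing you could check (just a suggestion, your log doesn't confirm it either way) is whether the script is actually running on the GPU rather than silently falling back to the CPU, for example:

```python
import torch

# Quick check that PyTorch can see and will use the GPU.
print(torch.cuda.is_available())          # should print True
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. the RTX 3050 Ti Laptop GPU
```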
The CPU of my computer is an AMD Ryzen 7 5800H with Radeon Graphics, and the graphics card is an NVIDIA GeForce RTX 3050 Ti Laptop GPU. Is the performance insufficient?
Considering your GPU (a 3050 Ti laptop card), I think this running speed is quite possibly normal. The 3050 Ti laptop GPU has 2560 CUDA cores, 4 GB of memory, and an 80 W TDP; it is not well suited for training complex models. We trained our model on an RTX 3090 (10496 cores, 24 GB memory, 350 W TDP). There is a significant performance gap between these two graphics cards.
Or is it because the given hyperparameters are wrong?