LarsBentsen / FFTransformer

Multi-Step Spatio-Temporal Forecasting: https://authors.elsevier.com/sd/article/S0306-2619(22)01822-0

What's the difference between type1, type2, type3 and type4 in tools.py #4

sunxiaoyao-git closed 1 year ago

sunxiaoyao-git commented 1 year ago

I see that you wrote four types of the adjust_learning_rate function. What is the theory behind each type?

LarsBentsen commented 1 year ago

Hi! Below is a brief explanation of each of the four settings. I experimented with all of them, but found that type1 without lr warmup, which decreases the lr after every epoch, worked OK for my application. Note that the input epoch can be interpreted either as the training iteration (i.e. the number of updates performed so far) or as the epoch number (if the lr is only updated after every epoch rather than after every iteration). I mainly updated after every epoch, but this might differ between applications, and parts of the code in exp_main.py might have to be changed accordingly. Hope this helps, and please re-open the issue if this did not answer your question!
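To make the two interpretations of the epoch argument concrete, here is a minimal sketch. It is not the code from tools.py: the helper names decayed_lr and lr_trace and the halving rule are assumptions chosen only for illustration. It shows how the same schedule behaves when it is stepped once per epoch versus once per training iteration.

```python
def decayed_lr(base_lr, step, decay=0.5):
    """Hypothetical schedule (an assumption, not the repository's type1):
    multiply the lr by `decay` for every step after the first."""
    return base_lr * decay ** (step - 1)


def lr_trace(base_lr, num_epochs, iters_per_epoch, per_iteration):
    """Return the lr used at every training iteration.

    per_iteration=False: the schedule is stepped with the epoch number.
    per_iteration=True:  the schedule is stepped with the running iteration count.
    """
    trace = []
    total_iters = 0  # counts updates across all epochs, never reset
    for epoch in range(1, num_epochs + 1):
        for _ in range(iters_per_epoch):
            total_iters += 1
            step = total_iters if per_iteration else epoch
            trace.append(decayed_lr(base_lr, step))
    return trace


per_epoch = lr_trace(1e-3, num_epochs=3, iters_per_epoch=2, per_iteration=False)
per_iter = lr_trace(1e-3, num_epochs=3, iters_per_epoch=2, per_iteration=True)
print(per_epoch)
print(per_iter)
```

With per-epoch stepping the lr stays constant within an epoch and drops between epochs. With per-iteration stepping it drops after every update, so the decay factor would normally need to be much closer to 1.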

sunxiaoyao-git commented 1 year ago

Thanks! Pretty useful!

sunxiaoyao-git commented 1 year ago

Sorry, I have another question about the type4 method. As I understand it, total_num_iters should restart in each epoch, not in each iteration.

LarsBentsen commented 1 year ago

The terminology might be somewhat misleading. The total_num_iter variable should keep track of the total number of training iterations performed so far, irrespective of the epoch number, and is only used for type4. This is because, when you update the learning rate at every training iteration using type4, you don't want to restart the learning rate schedule every epoch. I apologise for the slightly confusing terminology and implementation, as I just made it to test a few different learning rate strategies. :)
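A small sketch of why the counter is kept across epochs. The schedule here (linear warmup followed by inverse-square-root decay) is only a stand-in and is not claimed to be the actual type4 formula in tools.py. The function names warmup_lr and run, and the reset_each_epoch flag, are hypothetical.

```python
def warmup_lr(base_lr, step, warmup_steps):
    """Hypothetical per-iteration schedule (an assumption for illustration):
    linear warmup to base_lr, then inverse-square-root decay."""
    if step <= warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (warmup_steps / step) ** 0.5


def run(num_epochs, iters_per_epoch, warmup_steps, reset_each_epoch, base_lr=1e-3):
    """Collect the lr at every update, with or without resetting the counter."""
    lrs = []
    total_num_iter = 0
    for _ in range(num_epochs):
        if reset_each_epoch:
            total_num_iter = 0  # restarts the whole schedule, warmup included
        for _ in range(iters_per_epoch):
            total_num_iter += 1
            lrs.append(warmup_lr(base_lr, total_num_iter, warmup_steps))
    return lrs


global_counter = run(2, 4, 2, reset_each_epoch=False)
reset_counter = run(2, 4, 2, reset_each_epoch=True)
print(global_counter)
print(reset_counter)
```

With the counter kept across epochs, the lr warms up once and then keeps decaying. If the counter were reset every epoch, the second epoch would repeat the warmup and replay exactly the same lr values as the first.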