Closed MaiRajborirug closed 7 months ago
Just comment out that line. I would expect that supervised fine-tuning works if you start with fresh Adam momentums, similar to starting a new training run from pre-trained ImageNet weights (where you also don't load the optimizer weights).
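Roughly like this, a minimal sketch assuming a standard PyTorch setup (the model, filenames, and learning rate below are illustrative, not the repo's actual code):

```python
import torch
import torch.nn as nn

# Stand-in for the real network (illustrative only).
model = nn.Linear(10, 2)

# Pretend this is the provided pre-trained checkpoint.
torch.save(model.state_dict(), "model.pth")

# Load only the model weights from the checkpoint.
model.load_state_dict(torch.load("model.pth", map_location="cpu"))

# optimizer.load_state_dict(torch.load("optimizer.pth"))  # <- comment this out

# Build a fresh optimizer instead: Adam's momentum/variance buffers
# start at zero, just like fine-tuning from ImageNet weights.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```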
Got it, thank you so much!
Then what should the learning rate be? Should I keep the learning rate at 10e-4, or should I take into account that the model has already trained for 41 epochs and reduce the initial learning rate?
Well, it's not possible to predict the appropriate learning rate for a task in advance. Start with the default in the repository and test different values.
As a follow-up to issue 194: the given pre-trained weights don't come with an `optimizer.pth`. However, when we load the weight states, we are also supposed to load the optimizer, as shown in train.py line 183. Since we don't have an `optimizer.pth` in the pre-trained folder, what should I do to make the training go smoothly?
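To make the question concrete, the pattern I'd need is roughly the sketch below (the filenames and variable names are my guesses, not necessarily what train.py actually uses):

```python
import os
import torch
import torch.nn as nn

# Stand-in for the real network and optimizer (illustrative only).
model = nn.Linear(10, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

opt_path = os.path.join("pretrained", "optimizer.pth")
if os.path.exists(opt_path):
    # Resume case: restore Adam's momentum/variance buffers.
    optimizer.load_state_dict(torch.load(opt_path, map_location="cpu"))
else:
    # Fine-tuning case: no optimizer.pth, so keep the fresh Adam state.
    print(f"{opt_path} not found; starting with a fresh optimizer")
```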