Dear author:
I was glad to see your work, and I noticed that you use SGD with lr=1e-3.
Could you tell me whether you also use a learning rate decay strategy, and share any other details about fine-tuning?
I would appreciate it if you could reply soon!
Best regards, Zhenyu Liu