Converge trend of the model

cjw2021 / QAHOI

Apache License 2.0


Converge trend of the model #2

Closed hwfan closed 2 years ago

hwfan commented 2 years ago

Hi authors,

Thanks for your open-source implementation. I followed your instructions and tried to reproduce the final detection performance. However, I found that the model converges very slowly: it takes almost 2 days to reach 150 epochs on two nodes with 8 GPUs each. Have you tried any way to accelerate training? Would scaling up the learning rate at the start of training help?

cjw2021 commented 2 years ago

Thanks for your interest in our work. We tried a 2x learning rate (the same setting as Deformable DETR), but it didn't work. The heavy burden of predicting verb classes may be what causes the slow convergence.
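For readers wondering what "scaling up the learning rate at the start of training" could look like, here is a minimal sketch of a scaled learning rate with linear warmup. The function name `scaled_lr`, the base rate of `2e-4`, and the warmup length are hypothetical values chosen for illustration, not taken from the QAHOI code. As noted above, the authors report that a 2x rate did not help in their experiments.

```python
def scaled_lr(step, base_lr=2e-4, scale=2.0, warmup_steps=1000):
    """Learning rate at a given optimizer step.

    Ramps linearly up to base_lr * scale over warmup_steps,
    then holds that scaled rate. All defaults are hypothetical.
    """
    target = base_lr * scale
    if step < warmup_steps:
        # Linear warmup: a fraction of the target rate, growing each step.
        return target * (step + 1) / warmup_steps
    return target


if __name__ == "__main__":
    # Print the rate at a few steps to see the ramp and the plateau.
    for step in (0, 499, 999, 5000):
        print(step, scaled_lr(step))
```

Whether a schedule like this speeds up convergence depends on the model and batch size, so it would need to be checked empirically.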