seoungwugoh / STM

Video Object Segmentation using Space-Time Memory Networks
405 stars 81 forks source link

About Optimization #26

Closed IIAT-MR-LL closed 4 years ago

IIAT-MR-LL commented 4 years ago

Hi, I would like to know some details of the Adam optimizer configuration. The paper mentions using a constant learning rate of 1e-5, but does not mention the weight decay, which is also important for optimization. Would you mind sharing the hyperparameter settings for the Adam optimizer (i.e. weight_decay and betas)?

Thanks

seoungwugoh commented 4 years ago

@IIAT-MR-LL In our follow-up experiments, reducing the LR even with a rough schedule gives us more stable results, so consider using an LR scheduler. We did not change the weight decay or other parameters (we used the default Adam hyperparameters except for the learning rate).
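A minimal PyTorch sketch of this setup. The base LR of 1e-5 and the use of default Adam hyperparameters follow the thread above; the `StepLR` milestones (`step_size`, `gamma`) and the tiny placeholder model are illustrative assumptions, not the authors' exact values:

```python
import torch
import torch.nn as nn

# Tiny placeholder model standing in for the STM network (illustrative only).
model = nn.Linear(8, 8)

# Adam with the paper's constant LR of 1e-5; betas and weight_decay are left
# at the PyTorch defaults (betas=(0.9, 0.999), weight_decay=0), per the reply.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

# A rough decay schedule as suggested above; step_size and gamma here are
# assumed for illustration, not taken from the paper.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.5)

for iteration in range(300):
    # ... forward pass, loss, loss.backward() would go here ...
    optimizer.step()
    scheduler.step()

print(optimizer.param_groups[0]['lr'])
```

After 300 steps with `step_size=100` and `gamma=0.5`, the LR has been halved three times (1e-5 → 1.25e-6), giving the kind of rough reduction described above.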

IIAT-MR-LL commented 4 years ago

@seoungwugoh Thanks for the guidance.