LBH1024 / CAN

When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition (ECCV’2022 Poster).
MIT License

Is it appropriate to set initial_lr to 1? #4

Closed · Note-Liu closed this issue 2 years ago

Note-Liu commented 2 years ago

In config.yaml:

```yaml
optimizer: Adadelta
lr: 1
lr_decay: cosine
step_ratio: 10
step_decay: 5
eps: 1e-6
weight_decay: 1e-4
beta: 0.9
```

In training.py:

```python
new_lr = 0.5 * (1 + math.cos((current_step + 1 + (current_epoch - 1) * steps) * math.pi / (200 * steps))) * initial_lr
```

Did you set initial_lr to 1? On my own dataset, the "eval_ExpRate" fluctuates greatly.
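For context, here is a minimal sketch of what that schedule looks like over the 200-epoch run hard-coded into the quoted line; the helper name `cosine_lr` and the loop below are illustrative, not the repository's actual training.py:

```python
import math

def cosine_lr(current_epoch, current_step, steps, initial_lr=1.0, total_epochs=200):
    # Fraction of total training completed so far (epochs are 1-indexed, as in the quoted code).
    progress = (current_step + 1 + (current_epoch - 1) * steps) / (total_epochs * steps)
    # Half-cosine decay from initial_lr down to ~0 at the end of training.
    return 0.5 * (1 + math.cos(progress * math.pi)) * initial_lr

# With e.g. steps=100 batches per epoch: lr starts near 1.0, passes ~0.5 around epoch 100,
# and approaches 0 by epoch 200.
for epoch in (1, 50, 100, 150, 200):
    print(epoch, round(cosine_lr(epoch, 0, steps=100), 4))
```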

LBH1024 commented 2 years ago

Yes, initial_lr is set to 1, since the default lr of the Adadelta optimizer is 1. For CROHME and HME100K, this setting achieves the best results in our experiments. If it doesn't perform well on your own dataset, consider using a smaller lr or switching the optimizer (e.g., to Adam).
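For anyone hitting the same fluctuation, a minimal sketch of the default setting and the two suggested alternatives; `model` below is a placeholder module, not the repository's CAN construction code:

```python
import torch

model = torch.nn.Linear(10, 10)  # placeholder, stands in for the CAN model

# Default used for CROHME / HME100K: Adadelta with lr=1 (PyTorch's own default lr).
optimizer = torch.optim.Adadelta(model.parameters(), lr=1.0, eps=1e-6, weight_decay=1e-4)

# If eval_ExpRate fluctuates heavily on a custom dataset, try a smaller lr,
# or switch to Adam, e.g.:
# optimizer = torch.optim.Adadelta(model.parameters(), lr=0.1, eps=1e-6, weight_decay=1e-4)
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)
```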

Note-Liu commented 2 years ago

Thx.