polynomial scheduler & early stop patience info missing

uclanlp / DeepKPG

Deep Keyphrase Generation with Pre-trained Language Models

MIT License

polynomial scheduler & early stop patience info missing #2

Open kgarg8 opened 1 year ago

kgarg8 commented 1 year ago

Hi,

For fine-tuning BART-base on KP20k, could you share all the parameters of the polynomial decay scheduler you used, specifically `power` and `num_training_steps`? Releasing the JSON file required here would be even better.

Also, for how many epochs did you fine-tune on KP20k? 15 epochs? I assume you also used early stopping. If so, what patience value did you use?

Thanks and keep up your inspiring work!

xiaowu0162 commented 1 year ago

Hi,

For the LR scheduler, please refer to this implementation: https://github.com/huggingface/transformers/blob/ef42c2c487260c2a0111fa9d17f2507d84ddedea/src/transformers/optimization.py#L235
For BART-base and KP20k, we usually fine-tune for 15 epochs and pick the checkpoint with the best validation F1.
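
To make the checkpoint selection above concrete, here is a minimal sketch of picking the epoch with the best validation F1. The function name `best_checkpoint` and the F1 values in the usage lines are hypothetical placeholders, not results from this repository, and the tie-breaking rule (earliest epoch wins) is an assumption.

```python
def best_checkpoint(val_f1_by_epoch):
    """Return (epoch, f1) for the highest validation F1.

    Ties go to the earliest epoch (an assumption, not confirmed above).
    """
    # Sort by epoch so that max() keeps the earliest epoch on ties.
    return max(sorted(val_f1_by_epoch.items()), key=lambda kv: kv[1])


# Placeholder scores for illustration only, not real results.
scores = {1: 0.20, 2: 0.31, 3: 0.29}
epoch, f1 = best_checkpoint(scores)
print(f"keep checkpoint from epoch {epoch} (validation F1 = {f1})")
```

With this approach no patience value is needed: training runs for the full number of epochs and the best checkpoint is chosen afterwards.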

kgarg8 commented 1 year ago

Thanks for the prompt reply.

I plotted the scheduler with power=1.0 and observed that it reduces to linear decay. I just wanted to confirm.
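
For anyone who wants to reproduce this check without plotting, below is a minimal pure-Python sketch. It re-implements the decay formula as I understand it from the linked `optimization.py`, so treat it as an approximation of that code rather than the library itself. The warmup steps, total steps, and learning rate are hypothetical values chosen for illustration.

```python
def polynomial_decay_factor(step, num_warmup_steps, num_training_steps,
                            lr_init, lr_end=1e-7, power=1.0):
    """Multiplier applied to lr_init at a given step (warmup, then decay)."""
    if step < num_warmup_steps:
        # Linear warmup from 0 to 1.
        return step / max(1, num_warmup_steps)
    if step > num_training_steps:
        # After training ends, stay at lr_end.
        return lr_end / lr_init
    lr_range = lr_init - lr_end
    decay_steps = num_training_steps - num_warmup_steps
    pct_remaining = 1 - (step - num_warmup_steps) / decay_steps
    return (lr_range * pct_remaining ** power + lr_end) / lr_init


def linear_decay_factor(step, num_warmup_steps, num_training_steps):
    """Plain linear warmup followed by linear decay to 0."""
    if step < num_warmup_steps:
        return step / max(1, num_warmup_steps)
    return max(0.0, (num_training_steps - step)
               / max(1, num_training_steps - num_warmup_steps))


# Hypothetical settings, not the values used for BART-base on KP20k.
WARMUP, TOTAL, LR = 100, 1000, 5e-5

# With power=1.0 and lr_end=0.0 the two schedules should coincide.
max_gap = max(
    abs(polynomial_decay_factor(s, WARMUP, TOTAL, LR, lr_end=0.0, power=1.0)
        - linear_decay_factor(s, WARMUP, TOTAL))
    for s in range(TOTAL + 1)
)
print("largest difference between the two schedules:", max_gap)
```

One caveat: with a non-zero `lr_end` the curve is still a straight line, but it ends at `lr_end` instead of 0.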

xiaowu0162 commented 1 year ago

Yes, we just used that default linear decay.