Is the code for executing these fine-tuning strategies available as part of the KPGT GitHub repository? The paper that introduces KPGT states, "To fully take advantage of the abundant knowledge captured in the pre-training stage, KPGT introduces four finetuning strategies, including layer-wise learning rate decay (LLRD), re-initialization (ReInit), FLAG, and L2-SP."

Also, is there a table specifying which hyperparameter combination was used for each dataset? I have only been able to find the list of hyperparameter combinations tested, but not the final per-dataset choices.

Any response is greatly appreciated!

Thank you for your interest in our work! Apologies for the delayed response. The fine-tuning strategies are now available in the `finetune.py` script, and you can activate them with the following flags: `--use_flag`, `--use_llrd`, `--use_l2sp`, and `--use_reinit`. We did not record the optimal hyperparameter combinations for the different datasets.
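For reference, a run with all four strategies enabled might look like the sketch below. Only the script name and the four flags come from the reply above; any dataset, config, or checkpoint arguments are omitted, so check `python finetune.py --help` for whatever else your checkout of the repository requires.

```bash
# Sketch: fine-tuning with all four strategies enabled at once.
#   --use_flag    -> FLAG
#   --use_llrd    -> layer-wise learning rate decay (LLRD)
#   --use_l2sp    -> L2-SP
#   --use_reinit  -> re-initialization (ReInit)
# Dataset/checkpoint arguments are intentionally omitted here; see
# `python finetune.py --help` for the exact arguments the script expects.
python finetune.py --use_flag --use_llrd --use_l2sp --use_reinit
```

The flags are independent, so you can also enable any subset of them when searching for the best combination on a given dataset.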