Open MT010104 opened 2 years ago
Yes. "GPT-2" is NOT pre-trained on code. As for the different results, the hyper-parameters in this repo are used in fine-tuning CodeGPT. For GPT-2, you may try to finetune with more steps and select the best checkpoint with highest BLEU score in dev set. It's not always the latest checkpoint that performs the best in this task.
Thanks for your reply.
Hi! Thanks for the great work. Compared to the results of the model finetuned(60000 steps) by myself, the results above are much better. I wanna know the epochs of your finetuning and does "GPT-2" refers to "gpt2" on Huggingface which is not pretrained by code? Thanks in advance.
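For reference, a minimal sketch of the "pick the checkpoint with the best dev BLEU" step suggested above. This is not the repo's own evaluation script: the checkpoint layout (`output/checkpoint-*`), the per-checkpoint `dev.output` prediction file, the `dev.gold` reference file, and the use of `sacrebleu` as the metric are all assumptions for illustration.

```python
# Sketch: scan saved checkpoints and keep the one with the highest dev-set BLEU.
# Assumes each checkpoint dir already contains a dev.output file of predictions.
import glob
import os

import sacrebleu

DEV_GOLD = "dev.gold"               # assumed: one reference per line
CKPT_GLOB = "output/checkpoint-*"   # assumed checkpoint directory layout

with open(DEV_GOLD, encoding="utf-8") as f:
    refs = [line.strip() for line in f]

best_ckpt, best_bleu = None, -1.0
for ckpt in sorted(glob.glob(CKPT_GLOB)):
    pred_file = os.path.join(ckpt, "dev.output")  # assumed prediction dump
    if not os.path.exists(pred_file):
        continue
    with open(pred_file, encoding="utf-8") as f:
        hyps = [line.strip() for line in f]
    bleu = sacrebleu.corpus_bleu(hyps, [refs]).score
    print(f"{ckpt}: dev BLEU = {bleu:.2f}")
    if bleu > best_bleu:
        best_ckpt, best_bleu = ckpt, bleu

print(f"Best checkpoint: {best_ckpt} (dev BLEU {best_bleu:.2f})")
```

The selected checkpoint, rather than the final one, would then be the one to report test results on.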