Hi Lisa,
Thanks a lot for conducting such helpful experiments comparing prompt-based fine-tuning with prompt tuning. However, I'm having difficulty reproducing the BLEU and ROUGE-L scores shown in Figure 3 below. Could you give me some clues on how BLEU is calculated in your code in the low-data setting? (I only saw perplexity.)
I also re-implemented your prefix prompt with activations using GPT-2 (117M), but it only achieves a BLEU score of 15 on WikiData. Could you give me some insight into the low-data prefix-tuning setup?
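For reference, here is the BLEU computation I'm using for my own sanity checks (a minimal stdlib-only sketch of corpus-level BLEU-4 with a brevity penalty, assuming one tokenized reference per example; your official numbers presumably come from a different evaluation script, which is exactly what I'd like to confirm):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # Count all n-grams of length n in a token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(references, hypotheses, max_n=4):
    """Corpus-level BLEU with uniform weights and brevity penalty.

    references, hypotheses: lists of token lists (one reference per hypothesis).
    Returns a score in [0, 100].
    """
    clipped = [0] * max_n  # clipped n-gram matches, per order
    total = [0] * max_n    # total hypothesis n-grams, per order
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_len += len(ref)
        hyp_len += len(hyp)
        for n in range(1, max_n + 1):
            ref_counts = ngrams(ref, n)
            hyp_counts = ngrams(hyp, n)
            clipped[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            total[n - 1] += max(len(hyp) - n + 1, 0)
    if min(clipped) == 0:
        return 0.0  # no smoothing: any empty n-gram order zeroes the score
    log_prec = sum(math.log(c / t) for c, t in zip(clipped, total)) / max_n
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * bp * math.exp(log_prec)
```

If your evaluation differs from this (e.g. multiple references, different tokenization, or a smoothed/sacreBLEU variant), that could well explain the gap I'm seeing.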
Thanks in advance!