Thanks for your amazing work!
I have a question about which GPT-2 model you used. Section A.1 of the paper says you use GPT-2 large as your language model, but I found that your code actually loads the GPT-2 base model:
Is there a mistake somewhere?
Hi, thanks for catching that. It is a mistake in the paper: we used the base model.
Please let me know if you try both and get better results with one of them.
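For anyone who wants to double-check which GPT-2 size a checkpoint corresponds to, here is a minimal sketch. It only uses the publicly documented layer counts and hidden sizes of the four GPT-2 variants. The variant names follow the Hugging Face Hub convention (`gpt2` is the base model), which is an assumption, since the thread does not say which loader the repository uses. The helper `identify_gpt2_variant` is a hypothetical name, not something from the repository.

```python
# Published GPT-2 sizes as (n_layer, n_embd, n_head).
# Keys follow Hugging Face Hub naming; this is an assumption about the
# loader, not something confirmed by the repository discussed above.
GPT2_VARIANTS = {
    "gpt2": (12, 768, 12),          # base, about 124M parameters
    "gpt2-medium": (24, 1024, 16),  # about 355M parameters
    "gpt2-large": (36, 1280, 20),   # about 774M parameters
    "gpt2-xl": (48, 1600, 25),      # about 1.5B parameters
}


def identify_gpt2_variant(n_layer, n_embd):
    """Return the variant name matching the given depth and width, or None."""
    for name, (layers, width, _heads) in GPT2_VARIANTS.items():
        if (layers, width) == (n_layer, n_embd):
            return name
    return None


if __name__ == "__main__":
    # Hypothetical usage: pass in the depth and width read from a loaded model.
    print(identify_gpt2_variant(12, 768))
    print(identify_gpt2_variant(36, 1280))
```

If the model is loaded through the `transformers` library, the two inputs should be available as `model.config.n_layer` and `model.config.n_embd`, so the check can be run on whatever the code actually loads.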