theoqian closed this issue 2 years ago
@theoqian Hi,
I think this issue in the P-tuning v2 repo asks the same question as yours. The gap arises because in P-tuning v2 we report a fixed-backbone (frozen language model) version of P-tuning v1, following the experimental setting of Lester et al.
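To make the two settings concrete, here is a minimal PyTorch sketch. It is not the code from either repo: the tiny `TransformerEncoderLayer` stands in for a pretrained language model, and `PromptTunedModel`, `freeze_backbone`, and `count_trainable` are hypothetical names chosen for illustration. It also assumes a linear classification head that stays trainable in both settings, and it omits the prompt encoder used in P-tuning v1.

```python
import torch
import torch.nn as nn


class PromptTunedModel(nn.Module):
    """Toy model: continuous prompts prepended to a small stand-in backbone."""

    def __init__(self, vocab_size=100, hidden=16, prompt_len=4,
                 num_labels=2, freeze_backbone=True):
        super().__init__()
        # Stand-in for a pretrained language model (assumption, not BERT).
        self.embed = nn.Embedding(vocab_size, hidden)
        self.backbone = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=2, dim_feedforward=32, batch_first=True)
        # Continuous prompt embeddings: trainable in both settings.
        self.prompt = nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)
        # Classification head (assumed trainable in both settings).
        self.head = nn.Linear(hidden, num_labels)
        if freeze_backbone:
            # Fixed-backbone setting: the language model gets no updates.
            for module in (self.embed, self.backbone):
                for p in module.parameters():
                    p.requires_grad = False

    def forward(self, input_ids):
        tokens = self.embed(input_ids)  # (batch, seq, hidden)
        prompts = self.prompt.unsqueeze(0).expand(tokens.size(0), -1, -1)
        hidden = self.backbone(torch.cat([prompts, tokens], dim=1))
        return self.head(hidden[:, 0])  # classify from the first position


def count_trainable(model):
    """Number of parameters that would receive gradient updates."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


frozen = PromptTunedModel(freeze_backbone=True)   # fixed-backbone setting
tuned = PromptTunedModel(freeze_backbone=False)   # backbone tuned jointly

print("trainable (frozen backbone):", count_trainable(frozen))
print("trainable (tuned backbone): ", count_trainable(tuned))
```

With the backbone frozen, only the prompt and head parameters are optimized, so the two settings train very different numbers of parameters and their scores are not directly comparable.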
Thanks for your reply.
Hi, I find that most of the SuperGLUE metrics for PT reported in the P-Tuning paper are better than those for fine-tuning, but the PT metrics reported in the P-TuningV2 paper are much worse than fine-tuning. For example, on BoolQ, the P-Tuning paper reports an accuracy of 72.9 for fine-tuning and 73.9 for PT, while the P-TuningV2 paper reports 77.7 for fine-tuning and 67.2 for PT.
So in the P-TuningV2 paper PT appears much worse than fine-tuning, which is the opposite of the conclusion in the P-Tuning paper.