Open DonceDace opened 1 year ago
Take the CB dataset as an example: for P-tuning on bert-base-cased, the paper reports Acc. 89.2 and F1 92.1. The paper then says:
MP zero-shot and MP fine-tuning report results of a single pattern, while anchors for P-tuning are selected from the same prompt.
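Does this mean that MP zero-shot, MP fine-tuning, and P-tuning all report results using the same single pattern?
When I run the code, the result I get is a mean ± standard deviation. So for the fully supervised learning experiments, are the numbers reported in the paper just the mean?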
Looking through the code, I found the following:
# searched patterns in fully-supervised learning
# string_list_a = [text_a, ' question: ', text_b, ' true, false or neither? answer:', "the", self.mask]
# string_list_a = [text_a, "[SEP]", example.text_b, "?", 'the', " answer: ", self.mask]
# string_list_a = [text_a, "the", text_b, "?", "Answer:", self.mask]
# string_list_a = [text_a, 'the the', 'question:', text_b, '?', 'the the', 'answer:', self.mask]
# string_list_a = [text_a, "[SEP]", text_b, "?", "the", self.mask]
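For the fully supervised learning experiments with P-tuning, is the reported performance the average over these 5 patterns, or is each pattern run 3 times, with the mean of the best-performing pattern reported?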
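To make the two alternatives concrete, here is a minimal sketch of how each protocol would aggregate the scores. The accuracy values, the pattern names, and the helper functions are all hypothetical and are not taken from the paper or the repository; the only assumption carried over from the question is 5 patterns with 3 runs each.

```python
from statistics import mean, stdev

# Hypothetical accuracies (in %): 5 patterns x 3 runs each.
# These numbers are made up for illustration only.
runs = {
    "pattern_1": [87.5, 89.3, 85.7],
    "pattern_2": [83.9, 85.7, 87.5],
    "pattern_3": [89.3, 91.1, 87.5],
    "pattern_4": [82.1, 85.7, 83.9],
    "pattern_5": [85.7, 87.5, 89.3],
}


def per_pattern_stats(runs):
    """Mean and sample standard deviation of the runs for each pattern."""
    return {name: (mean(scores), stdev(scores)) for name, scores in runs.items()}


def average_over_patterns(runs):
    """Protocol A: average the per-pattern means over all patterns."""
    stats = per_pattern_stats(runs)
    return mean([m for m, _ in stats.values()])


def best_pattern_mean(runs):
    """Protocol B: pick the pattern with the highest mean over its runs
    and report that pattern's mean and standard deviation."""
    stats = per_pattern_stats(runs)
    best = max(stats, key=lambda name: stats[name][0])
    best_mean, best_std = stats[best]
    return best, best_mean, best_std


if __name__ == "__main__":
    print("Protocol A (average over patterns):", round(average_over_patterns(runs), 2))
    name, m, s = best_pattern_mean(runs)
    print(f"Protocol B (best pattern = {name}): {m:.2f} ± {s:.2f}")
```

The two protocols generally give different numbers, and Protocol B can never be lower than Protocol A, which is why I would like to know which one the paper uses.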