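[Question]: zero_shot_text_classification (UTC zero-shot text classification): labels cannot be predicted at the prediction stage, although results on the training and validation sets are correct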

dingidng commented 1 year ago

Please describe your question

from pprint import pprint
from paddlenlp import Taskflow
schema = ["病情诊断", "治疗方案", "病因分析", "指标解读", "就医建议", "疾病表述", "后果表述", "注意事项", "功效作用", "医疗费用", "其他"]
# my_cls = Taskflow("zero_shot_text_classification", model="utc-base", schema=schema, task_path='/home/aistudio/checkpoint/model_best', precision="fp16")
my_cls = Taskflow("zero_shot_text_classification", model="utc-base", schema=schema, task_path='/home/aistudio/checkpoint/model_best')
pprint(my_cls("白癜风扩散的主要原因有哪些"))

[{'predictions': [{'label': '病情诊断', 'score': 0.5840373963936866},
                  {'label': '治疗方案', 'score': 0.5865470619983411},
                  {'label': '病因分析', 'score': 0.5843835913222092},
                  {'label': '指标解读', 'score': 0.5821808275276895},
                  {'label': '就医建议', 'score': 0.5865359246085544},
                  {'label': '疾病表述', 'score': 0.585401885318832},
                  {'label': '后果表述', 'score': 0.5869963863961639},
                  {'label': '注意事项', 'score': 0.5881809795301235},
                  {'label': '功效作用', 'score': 0.5805577597835053},
                  {'label': '医疗费用', 'score': 0.5814885387184837},
                  {'label': '其他', 'score': 0.5918049994795269}],
  'text_a': '白癜风扩散的主要原因有哪些'}]

[{'predictions': [{'label': '病情诊断', 'score': 0.5447408499106201},
                  {'label': '治疗方案', 'score': 0.5504865427793035},
                  {'label': '病因分析', 'score': 0.5413179173264995},
                  {'label': '指标解读', 'score': 0.5573936184368725},
                  {'label': '就医建议', 'score': 0.5477967212282746},
                  {'label': '疾病表述', 'score': 0.5419846449777911},
                  {'label': '后果表述', 'score': 0.5525397080801923},
                  {'label': '注意事项', 'score': 0.5354324520313458},
                  {'label': '功效作用', 'score': 0.5514831510741636},
                  {'label': '医疗费用', 'score': 0.5462238366276733},
                  {'label': '其他', 'score': 0.5445289292490739}],
  'text_a': '中性粒细胞比率偏低'}]

[2023-04-13 17:06:59,413] [ INFO] - test_loss = 1.6392
[2023-04-13 17:06:59,413] [ INFO] - test_macro_f1 = 0.8167
[2023-04-13 17:06:59,413] [ INFO] - test_micro_f1 = 0.9394
[2023-04-13 17:06:59,413] [ INFO] - test_runtime = 0:00:00.87
[2023-04-13 17:06:59,413] [ INFO] - test_samples_per_second = 6.835
[2023-04-13 17:06:59,413] [ INFO] - test_steps_per_second = 1.139

Results from running the zero-shot model directly:

[{'predictions': [{'label': '病因分析', 'score': 0.680140690380809}],
  'text_a': '白癜风扩散的主要原因有哪些'}]

[{'predictions': [{'label': '其他', 'score': 0.9360478829300829}],
  'text_a': '性粒细胞比率偏低'}]

These results look normal, but the problem appears at the prediction stage. Could you please help explain what is going wrong?
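
To make the symptom easier to report, here is a minimal sketch that measures how flat the scores are in a single result dict of the shape printed above. The helper names `score_spread` and `looks_uninformative` and the `min_spread` threshold of 0.05 are hypothetical and not part of PaddleNLP. The sample scores are copied from the first output in this post. The check only describes the symptom; it does not identify the cause.

```python
def score_spread(result):
    """Return max(score) - min(score) over the predictions of one result dict."""
    scores = [p["score"] for p in result["predictions"]]
    return max(scores) - min(scores)


def looks_uninformative(result, min_spread=0.05):
    """Flag a result whose label scores are nearly identical.

    min_spread is a hypothetical threshold chosen for illustration only.
    A result with a single prediction is never flagged.
    """
    return len(result["predictions"]) > 1 and score_spread(result) < min_spread


# Scores copied from the first output of the fine-tuned checkpoint above.
finetuned_result = {
    "predictions": [
        {"label": "病情诊断", "score": 0.5840373963936866},
        {"label": "治疗方案", "score": 0.5865470619983411},
        {"label": "病因分析", "score": 0.5843835913222092},
        {"label": "指标解读", "score": 0.5821808275276895},
        {"label": "就医建议", "score": 0.5865359246085544},
        {"label": "疾病表述", "score": 0.585401885318832},
        {"label": "后果表述", "score": 0.5869963863961639},
        {"label": "注意事项", "score": 0.5881809795301235},
        {"label": "功效作用", "score": 0.5805577597835053},
        {"label": "医疗费用", "score": 0.5814885387184837},
        {"label": "其他", "score": 0.5918049994795269},
    ],
    "text_a": "白癜风扩散的主要原因有哪些",
}

# Result of running the zero-shot model directly, as shown above.
zero_shot_result = {
    "predictions": [{"label": "病因分析", "score": 0.680140690380809}],
    "text_a": "白癜风扩散的主要原因有哪些",
}

print(round(score_spread(finetuned_result), 4))
print(looks_uninformative(finetuned_result))
print(looks_uninformative(zero_shot_result))
```

Applied to the list returned by `my_cls(...)`, the same check would be run on each element, for example `[looks_uninformative(r) for r in my_cls(text)]`.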