UIE performs worse after fine-tuning than zero-shot

PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.

https://paddlenlp.readthedocs.io

Apache License 2.0

UIE performs worse after fine-tuning than zero-shot #2906

Closed wireless911 closed 1 year ago

wireless911 commented 2 years ago

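I tested UIE on a medical entity-relation extraction task using an open-source dataset (my metric is precision, recall, and F1 computed over SPO triples):

1. Taskflow zero-shot F1: 0.2
2. few-shot F1: 0.19

After fine-tuning, the result is actually slightly worse. During training, the best F1 on the dev set (SpanEvaluator) was 0.547. Could you please take a look?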

Version and environment info: paddlenlp 2.3.4, paddle 2.3.0, environment: AI Studio
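For reference, the two numbers quoted above are not measured the same way: the 0.2 and 0.19 figures are F1 over complete SPO triples, while the 0.547 dev figure comes from SpanEvaluator, which presumably scores the spans extracted for each individual prompt. A triple is only correct if its subject, predicate, and object all match, so triple-level F1 can be much lower than span-level F1. The sketch below shows one common way to compute micro-averaged triple-level precision, recall, and F1 with exact matching. It is an assumption about how such a metric might look, not the poster's actual evaluation script; the function name `triple_prf` and the sample data are made up for illustration.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def triple_prf(gold: Iterable[Set[Triple]], pred: Iterable[Set[Triple]]):
    """Micro-averaged precision, recall and F1 over exact-match SPO triples.

    `gold` and `pred` are parallel sequences with one set of triples per document.
    """
    n_correct = n_pred = n_gold = 0
    for g, p in zip(gold, pred):
        n_correct += len(g & p)  # a triple counts only if all three parts match
        n_pred += len(p)
        n_gold += len(g)
    precision = n_correct / n_pred if n_pred else 0.0
    recall = n_correct / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical data, for illustration only.
gold = [
    {("aspirin", "treats", "headache")},
    {("diabetes", "symptom", "thirst"), ("diabetes", "symptom", "fatigue")},
]
pred = [
    {("aspirin", "treats", "headache"), ("aspirin", "treats", "fever")},
    {("diabetes", "symptom", "thirst")},
]

p, r, f1 = triple_prf(gold, pred)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")
```

In this toy example one spurious triple and one missed triple each cost a third of the score, which illustrates how over-extraction alone can pull triple-level precision down even when span-level scores look reasonable.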

linjieccc commented 2 years ago

@wireless911 Hello, could you share the AI Studio project URL and set the project to public?

wireless911 commented 2 years ago

Thanks for the reply. My AI Studio project is here: https://aistudio.baidu.com/aistudio/projectdetail/4374394?shared=1

linjieccc commented 2 years ago

The project does not seem to include the data conversion and training code. Were your training and validation sets built with doccano.py?

wireless911 commented 2 years ago

I have a finetune.ipynb file. Can you only see main.ipynb?

linjieccc commented 2 years ago

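What is the current ratio of positive to negative samples in the training set? Judging from the predictions, there are many spurious extractions (false positives), so you could try increasing the proportion of negative samples.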
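One way to answer the ratio question is to count how many training examples have an empty answer list. The sketch below assumes the training file is in JSON Lines form with `content`, `prompt`, and `result_list` fields, which is the layout doccano.py is understood to produce, and treats an example with an empty `result_list` as a negative sample. The function name `count_pos_neg` and the sample lines are hypothetical; adjust the field names if your converted data differs.

```python
import json
from typing import Iterable, Tuple


def count_pos_neg(lines: Iterable[str]) -> Tuple[int, int]:
    """Count examples with a non-empty (positive) vs. empty (negative) result_list."""
    pos = neg = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        example = json.loads(line)
        if example.get("result_list"):
            pos += 1
        else:
            neg += 1
    return pos, neg


# Hypothetical examples in the assumed format, for illustration only.
sample = [
    '{"content": "Aspirin relieves headache.", "prompt": "drug", '
    '"result_list": [{"text": "Aspirin", "start": 0, "end": 7}]}',
    '{"content": "Aspirin relieves headache.", "prompt": "surgery", "result_list": []}',
    '{"content": "Aspirin relieves headache.", "prompt": "examination", "result_list": []}',
]

pos, neg = count_pos_neg(sample)
print(f"positive={pos} negative={neg}")
```

To run it on a real file, pass an open file handle instead of the list, for example `count_pos_neg(open("train.txt", encoding="utf-8"))`, where the path is a placeholder. If negatives turn out to be scarce, the doccano.py script in the PaddleNLP UIE example is understood to expose a `negative_ratio` option for generating more of them; check the script in your installed version before relying on it.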

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 60 days with no activity.

github-actions[bot] commented 1 year ago

This issue was closed because it has been inactive for 14 days since being marked as stale.