PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
https://paddlenlp.readthedocs.io
Apache License 2.0

[Question]: UIE fine-tuning: Recall and F1 stay between 10% and 20% #3403

Closed jack-gits closed 1 year ago

jack-gits commented 2 years ago

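Please ask your question

I am fine-tuning the uie-base-en model, and Recall and F1 stay between 10% and 20% throughout training. Any advice would be appreciated.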


wawltor commented 2 years ago

Could you share some of the training data so we can take a look?

jack-gits commented 2 years ago

{"content": " ZKSMLQPWTBDURXYNF NH658IUL2501JF4 VAUFEROKCSDQX HCSETRPYAFUDKM", "result_list": [{"text": "NH658IUL2501JF4", "start": 19, "end": 34}], "prompt": "vin"}
{"content": " ZKSMLQPWTBDURXYNF VAUFEROKCSDQX HCSETRPYAFUDKM", "result_list": [], "prompt": "vin"}
{"content": " NIGWPJKCSAYBL ULPDZHJTWANCXFKMV XUCIDKQWZMONYPT D1604MX4C0R72PT90", "result_list": [{"text": "D1604MX4C0R72PT90", "start": 49, "end": 66}], "prompt": "vin"}
{"content": " NIGWPJKCSAYBL ULPDZHJTWANCXFKMV XUCIDKQWZMONYPT ", "result_list": [], "prompt": "vin"}
{"content": " AOPWGHZYXNSLQJTE MHSFAJXWRCQUPVZTYG NZVWHPFCKEMLT TKWSNDUOYPHJXVLR 8H9D9X08937FBL1W", "result_list": [{"text": "8H9D9X08937FBL1W", "start": 68, "end": 84}], "prompt": "vin"}
{"content": " AOPWGHZYXNSLQJTE MHSFAJXWRCQUPVZTYG NZVWHPFCKEMLT TKWSNDUOYPHJXVLR ", "result_list": [], "prompt": "vin"}
{"content": " 26Q7FA019Y6M5CB8 VPSJFYGUKWENOZ OAJRQVPSULIHKGWM", "result_list": [{"text": "26Q7FA019Y6M5CB8", "start": 1, "end": 17}], "prompt": "vin"}
{"content": " VPSJFYGUKWENOZ OAJRQVPSULIHKGWM", "result_list": [], "prompt": "vin"}
{"content": " INDXUFJOYLMETCW 2FI681LC784N9A5", "result_list": [{"text": "2FI681LC784N9A5", "start": 17, "end": 32}], "prompt": "vin"}
{"content": " INDXUFJOYLMETCW ", "result_list": [], "prompt": "vin"}
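
Before looking at the model, it can help to rule out misaligned labels. The following is a minimal sketch, not part of PaddleNLP: it checks that each `start`/`end` pair in a record really slices out the labeled `text`. The helper name `find_misaligned_spans` is hypothetical, and the two records are copied from the sample above.

```python
import json

# Two records copied from the posted sample (one positive, one negative).
SAMPLE_LINES = [
    '{"content": " ZKSMLQPWTBDURXYNF NH658IUL2501JF4 VAUFEROKCSDQX HCSETRPYAFUDKM", '
    '"result_list": [{"text": "NH658IUL2501JF4", "start": 19, "end": 34}], "prompt": "vin"}',
    '{"content": " ZKSMLQPWTBDURXYNF VAUFEROKCSDQX HCSETRPYAFUDKM", '
    '"result_list": [], "prompt": "vin"}',
]


def find_misaligned_spans(lines):
    """Return (line_index, span) pairs whose offsets do not reproduce `text`.

    Hypothetical helper: assumes one JSON object per line with the
    `content` / `result_list` / `prompt` keys shown in the sample.
    """
    problems = []
    for i, line in enumerate(lines):
        record = json.loads(line)
        for span in record["result_list"]:
            sliced = record["content"][span["start"]:span["end"]]
            if sliced != span["text"]:
                problems.append((i, span))
    return problems


print(find_misaligned_spans(SAMPLE_LINES))
```

An empty list means every labeled span matches its offsets, so the low scores would have to come from somewhere other than label alignment.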

jack-gits commented 2 years ago

@wawltor All of it is synthetically generated data.

wawltor commented 2 years ago

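It looks like this data carries little semantic information. Information extraction on data like this is unlikely to give good results. Does your real data look like this too?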
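
To illustrate the point about semantics: in the posted samples, the labeled span is the only token that contains digits, and the surrounding tokens are random letters with no context a language model could use. The sketch below is an illustration only, with a hypothetical helper `tokens_with_digits`. It shows that a surface-form rule already separates the target in these samples. It assumes nothing about the real data or about real VIN formats.

```python
def tokens_with_digits(content):
    """Return whitespace-separated tokens that contain at least one digit."""
    return [tok for tok in content.split() if any(ch.isdigit() for ch in tok)]


# Contents copied from the first positive/negative pair in the posted sample.
positive = " ZKSMLQPWTBDURXYNF NH658IUL2501JF4 VAUFEROKCSDQX HCSETRPYAFUDKM"
negative = " ZKSMLQPWTBDURXYNF VAUFEROKCSDQX HCSETRPYAFUDKM"

print(tokens_with_digits(positive))  # ['NH658IUL2501JF4']
print(tokens_with_digits(negative))  # []
```

If the real data has the same shape, a rule like this may be a more suitable baseline than a semantic extraction model.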

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 60 days with no activity.

github-actions[bot] commented 1 year ago

This issue was closed because it has been inactive for 14 days since being marked as stale.