PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, etc.
https://paddlenlp.readthedocs.io
Apache License 2.0
12.17k stars, 2.95k forks

Input data format validation for the UIE model #3139

Closed MachineSheep closed 1 year ago

MachineSheep commented 2 years ago

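I suggest adding a data format validation script to the UIE model code. Most UIE users start from data that has already been labeled, so they have to write their own conversion script to turn the existing format into the model's input format (or into the input format expected by doccano.py). After converting tens of thousands of records, a small number may end up malformed because the hand-written conversion script did not cover every case. These records are hard to track down, and they cause model training to fail with errors.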
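A minimal sketch of what such a validation script might look like, not part of PaddleNLP. It assumes the converted training data is a JSON-lines file in which each record has a `content` string, a `prompt` string, and a `result_list` of spans with `text`, `start`, and `end` fields. That schema is an assumption and should be checked against the UIE documentation for the version in use. The function names `validate_record` and `validate_lines` are hypothetical.

```python
import json

# Assumed schema for one training record; adjust to match your UIE version.
REQUIRED_KEYS = {"content", "prompt", "result_list"}
SPAN_KEYS = {"text", "start", "end"}


def validate_record(record):
    """Return a list of human-readable problems found in one parsed record."""
    if not isinstance(record, dict):
        return ["record is not a JSON object"]
    missing = REQUIRED_KEYS - record.keys()
    if missing:
        return [f"missing keys: {sorted(missing)}"]
    content = record["content"]
    if not isinstance(content, str) or not isinstance(record["prompt"], str):
        return ["content and prompt must be strings"]
    if not isinstance(record["result_list"], list):
        return ["result_list must be a list"]

    problems = []
    for i, span in enumerate(record["result_list"]):
        if not isinstance(span, dict) or not SPAN_KEYS <= span.keys():
            problems.append(f"result_list[{i}] must have text/start/end")
            continue
        start, end = span["start"], span["end"]
        if not (isinstance(start, int) and isinstance(end, int)):
            problems.append(f"result_list[{i}] start/end must be integers")
        elif not 0 <= start < end <= len(content):
            problems.append(f"result_list[{i}] offsets out of range")
        elif content[start:end] != span["text"]:
            problems.append(f"result_list[{i}] text does not match offsets")
    return problems


def validate_lines(lines):
    """Yield (line_number, problem) pairs for JSON-lines input."""
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            yield lineno, f"invalid JSON: {exc.msg}"
            continue
        for problem in validate_record(record):
            yield lineno, problem
```

Running `validate_lines` over an open file handle (for example `open("train.txt", encoding="utf-8")`, where the file name is a placeholder) before training reports the line number of each malformed record, which is the part that is otherwise hard to locate in a large converted data set.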

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 60 days with no activity.

github-actions[bot] commented 1 year ago

This issue was closed because it has been inactive for 14 days since being marked as stale.