PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
https://paddlenlp.readthedocs.io
Apache License 2.0

[Question]: Taskflow prediction with uie-data-distill-gp fails #4151

Closed GUSHUMING closed 1 year ago

GUSHUMING commented 1 year ago

Please describe your question

Error message:

[2022-12-19 10:04:01,372] [    INFO] - We are using <class 'paddlenlp.transformers.ernie.tokenizer.ErnieTokenizer'> to load 'C:\Users\16929\Desktop\data_distill\court\checkpoint\model_30000'.
Traceback (most recent call last):
  File "C:/Users/16929/Desktop/data_distill/predict_taskflow.py", line 4, in <module>
    print(gp("借款后,被告偿还了本金10000元,仍欠原告50000元,现原告诉请要求被告偿还借款本金50000元,依法有据,本院予以支持"))
  File "C:\Users\16929\anaconda3\envs\paddle\lib\site-packages\paddlenlp\taskflow\taskflow.py", line 552, in __call__
    results = self.task_instance(inputs)
  File "C:\Users\16929\anaconda3\envs\paddle\lib\site-packages\paddlenlp\taskflow\task.py", line 429, in __call__
    outputs = self._run_model(inputs)
  File "C:\Users\16929\anaconda3\envs\paddle\lib\site-packages\paddlenlp\taskflow\information_extraction.py", line 1283, in _run_model
    all_preds[0].extend(batch_outputs[0])  # Entity output
IndexError: list index out of range

Process finished with exit code 1


The problem may be in this part of the prediction code (the print call is a debugging line I added):

        if isinstance(batch_outputs, tuple):
            print(batch_outputs)
            all_preds[0].extend(batch_outputs[0])  # Entity output
            all_preds[1].extend(batch_outputs[1])  # Relation output
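
To illustrate how this branch can raise the reported IndexError, here is a minimal sketch in plain Python. It is not PaddleNLP's actual code: the `accumulate` helper, the `else` branch, and the idea that the accumulator is a flat empty list for some task types are all assumptions made for illustration. Only the two `extend` lines come from the snippet above.

```python
def accumulate(all_preds, batch_outputs):
    """Hypothetical helper mirroring the branch quoted above."""
    if isinstance(batch_outputs, tuple):
        all_preds[0].extend(batch_outputs[0])  # Entity output
        all_preds[1].extend(batch_outputs[1])  # Relation output
    else:
        # Assumed fallback for single-output tasks (not shown in the issue).
        all_preds.extend(batch_outputs)
    return all_preds


# Accumulator prepared with two sub-lists: a tuple of outputs fits.
ok = accumulate(([], []), (["entity"], ["relation"]))

# Assumed failure mode: the accumulator is a flat empty list,
# but the model still returns an (entities, relations) tuple.
try:
    accumulate([], (["entity"], ["relation"]))
    error = None
except IndexError as exc:
    error = str(exc)

print(ok)
print(error)
```

If the accumulator is sized for one output while the post-processing returns two, indexing `all_preds[0]` on an empty list fails in the same way as the traceback.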

The paddlenlp version is 2.4.5. The prediction code is:

from paddlenlp import Taskflow

# Schema passed to Taskflow: trigger type mapped to its argument roles.
schema = {"判决结果是什么的触发词": ["判决相关金额", "判决诉求是什么", "主动方身份是什么", "主动方是谁", "被动方身份是什么", "被动方是谁"]}
gp = Taskflow("information_extraction", model="uie-data-distill-gp", task_path=r"C:\Users\16929\Desktop\data_distill\court\checkpoint\model_30000", schema=schema)
print(gp("借款后,被告偿还了本金10000元,仍欠原告50000元,现原告诉请要求被告偿还借款本金50000元,依法有据,本院予以支持"))

model_config:

{
  "task_type": "event_extraction",
  "label_maps": {
    "entity2id": {
      "判决结论是什么的触发词": 0,
      "object": 1
    },
    "relation2id": {
      "判决相关金额": 0,
      "判决描述是什么": 1,
      "主动方身份是什么": 2,
      "主动方是谁": 3,
      "被动方身份是什么": 4,
      "被动方是谁": 5
    },
    "schema": [
      {
        "判决结论是什么的触发词": [
          "判决相关金额",
          "判决描述是什么",
          "主动方身份是什么",
          "主动方是谁",
          "被动方身份是什么",
          "被动方是谁"
        ]
      }
    ],
    "id2entity": {
      "0": "判决结论是什么的触发词",
      "1": "object"
    },
    "id2relation": {
      "0": "判决相关金额",
      "1": "判决描述是什么",
      "2": "主动方身份是什么",
      "3": "主动方是谁",
      "4": "被动方身份是什么",
      "5": "被动方是谁"
    }
  },
  "encoder": "ernie-3.0-mini-zh"
}

Partial training log:

[2022-12-19 00:26:59,473] [    INFO] - global step 38550, epoch: 20, loss: 1.72814, speed: 7.86 step/s
[2022-12-19 00:27:05,608] [    INFO] - global step 38600, epoch: 20, loss: 1.72883, speed: 8.15 step/s
[2022-12-19 00:27:11,857] [    INFO] - global step 38650, epoch: 20, loss: 1.89926, speed: 8.00 step/s
[2022-12-19 00:27:18,121] [    INFO] - global step 38700, epoch: 20, loss: 1.74767, speed: 7.98 step/s
[2022-12-19 00:27:24,702] [    INFO] - global step 38750, epoch: 20, loss: 1.68783, speed: 7.60 step/s
[2022-12-19 00:27:30,996] [    INFO] - global step 38800, epoch: 20, loss: 1.65247, speed: 7.94 step/s
[2022-12-19 00:27:37,403] [    INFO] - global step 38850, epoch: 20, loss: 1.81134, speed: 7.80 step/s
[2022-12-19 00:27:43,599] [    INFO] - global step 38900, epoch: 20, loss: 1.70744, speed: 8.07 step/s
[2022-12-19 00:27:50,221] [    INFO] - global step 38950, epoch: 20, loss: 1.65114, speed: 7.55 step/s
[2022-12-19 00:27:56,788] [    INFO] - global step 39000, epoch: 20, loss: 1.66749, speed: 7.61 step/s
[2022-12-19 00:27:57,261] [    INFO] - Evaluation precision: {'entity_f1': 0.78768, 'entity_precision': 0.7629, 'entity_recall': 0.81411, 'relation_f1': 0.66667, 'relation_precision': 0.6582, 'relation_recall': 0.67536}

GUSHUMING commented 1 year ago

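I found the problem: changing task_type in the model config from event_extraction to relation_extraction fixed it. event_extraction does not support relations at prediction time, but it appears to support them during training. I hope the two can be made consistent.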

GUSHUMING commented 1 year ago

Solved by changing event_extraction to relation_extraction in the model config. event_extraction does not support relations.
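
The workaround described above can be sketched as a small script that rewrites the `task_type` field of the model config. This is only a sketch: the helper name `switch_task_type` is hypothetical, and it assumes the config shown earlier is stored as a UTF-8 JSON file inside the checkpoint directory (the exact file name is not given in this thread).

```python
import json
from pathlib import Path


def switch_task_type(config_path, new_task_type="relation_extraction"):
    """Rewrite task_type in a model config JSON file and return the old value.

    Hypothetical helper; the config location is an assumption.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    old_task_type = config.get("task_type")
    config["task_type"] = new_task_type
    # ensure_ascii=False keeps the Chinese labels readable in the file.
    path.write_text(
        json.dumps(config, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return old_task_type
```

Calling `switch_task_type` with the path to the config file in the checkpoint directory would return `"event_extraction"` for the config posted above and leave the label maps untouched. Keeping a backup copy of the file before editing is a sensible precaution.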