PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, etc.
https://paddlenlp.readthedocs.io
Apache License 2.0

[Bug]: Example (information extraction) fails #8693

Closed yuwiggin closed 2 weeks ago

yuwiggin commented 3 months ago

Software environment

- paddlepaddle: 2.6.1
- paddlepaddle-gpu: 
- paddlenlp: 2.8.1
- paddle2onnx: 1.2.4
- paddlefsl: 1.1.0
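
A version list like the one above can be gathered programmatically when filing a report. This is a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`). The helper name `collect_versions` is hypothetical and is not part of PaddleNLP.

```python
from importlib import metadata


def collect_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # distribution is absent from this environment
    return versions


if __name__ == "__main__":
    # Distribution names taken from the environment list in this issue.
    names = ["paddlepaddle", "paddlepaddle-gpu", "paddlenlp", "paddle2onnx", "paddlefsl"]
    for name, version in collect_versions(names).items():
        print(f"- {name}: {version or ''}")
```

A package that is not installed is printed with an empty version, matching the blank `paddlepaddle-gpu` entry above.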

Duplicate issues

Error description

/home/wiggin/miniconda3/envs/paddle_env/lib/python3.8/site-packages/_distutils_hack/__init__.py:26: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")
[2024-07-01 23:27:22,616] [ WARNING] - if you run ring_flash_attention.py, please ensure you install the paddlenlp_ops by following the instructions provided at https://github.com/PaddlePaddle/PaddleNLP/blob/develop/csrc/README.md
Traceback (most recent call last):
  File "extract.py", line 5, in <module>
    ie = Taskflow('information_extraction', schema=schema)
  File "/home/wiggin/PaddleNLP/paddlenlp/taskflow/taskflow.py", line 809, in __init__
    self.task_instance = task_class(
  File "/home/wiggin/PaddleNLP/paddlenlp/taskflow/information_extraction.py", line 538, in __init__
    self._construct_tokenizer()
  File "/home/wiggin/PaddleNLP/paddlenlp/taskflow/information_extraction.py", line 595, in _construct_tokenizer
    self._tokenizer = AutoTokenizer.from_pretrained(
  File "/home/wiggin/PaddleNLP/paddlenlp/transformers/auto/tokenizer.py", line 223, in from_pretrained
    raise ValueError("use_fast is deprecated")
ValueError: use_fast is deprecated
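
The last frame of the traceback shows `from_pretrained` raising as soon as it receives the `use_fast` keyword, which the information extraction task apparently still passes internally. The sketch below illustrates that failure pattern in isolation. `strict_from_pretrained` and `lenient_from_pretrained` are hypothetical stand-ins, not PaddleNLP code, and the lenient variant is only one common way to handle a deprecated keyword, not necessarily what the maintainers did.

```python
import warnings


def strict_from_pretrained(name, **kwargs):
    """Hypothetical stand-in: rejects the deprecated keyword outright."""
    if "use_fast" in kwargs:
        raise ValueError("use_fast is deprecated")
    return {"name": name, **kwargs}


def lenient_from_pretrained(name, **kwargs):
    """Hypothetical alternative: warn, drop the keyword, and continue."""
    if "use_fast" in kwargs:
        kwargs.pop("use_fast")
        warnings.warn(
            "use_fast is deprecated and will be ignored", DeprecationWarning
        )
    return {"name": name, **kwargs}
```

With the strict variant, any caller that still passes `use_fast` fails at construction time, which matches the behavior reported here. With the lenient variant, older callers keep working and only see a warning.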

Steps to reproduce & code

extract.py is as follows:

# Information extraction
from pprint import pprint
from paddlenlp import Taskflow

# Define the schema for entity extraction: time, athlete, event name
schema = ['时间', '选手', '赛事名称']
ie = Taskflow('information_extraction', schema=schema)
pprint(ie("2月8日上午北京冬奥会自由式滑雪女子大跳台决赛中中国选手谷爱凌以188.25分获得金牌!"))

Running it fails with the error shown above.

wawltor commented 3 months ago

Thanks for reporting this. We have fixed it; please see https://github.com/PaddlePaddle/PaddleNLP/commit/cf57f86755c02860aedc7c65b0a6d51b1d55dbf4

github-actions[bot] commented 1 month ago

This issue is stale because it has been open for 60 days with no activity.

github-actions[bot] commented 2 weeks ago

This issue was closed because it has been inactive for 14 days since being marked as stale.