PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
https://paddlenlp.readthedocs.io
Apache License 2.0

[Question]: taskflow('information_extraction', schema=schema) produces no output #6759

Open Roger-G opened 1 year ago

Roger-G commented 1 year ago

Please ask your question

As the title says: entering the example directly produces no output. paddlenlp is 2.6.0.

>>> from pprint import pprint
>>> from paddlenlp import Taskflow


The pip list output is as follows:

aiofiles 23.1.0
aiohttp 3.8.5
aiosignal 1.3.1
altair 4.2.2
annotated-types 0.5.0
anyio 3.7.1
astor 0.8.1
asttokens 2.2.1
async-timeout 4.0.2
attrs 23.1.0
Babel 2.12.1
backcall 0.2.0
bce-python-sdk 0.8.87
blinker 1.6.2
cachetools 5.3.1
certifi 2023.5.7
charset-normalizer 3.2.0
click 8.1.6
colorama 0.4.6
colorlog 6.7.0
comm 0.1.3
contourpy 1.1.0
cycler 0.11.0
datasets 2.13.1
debugpy 1.6.7
decorator 5.1.1
dill 0.3.4
easydict 1.10
entrypoints 0.4
exceptiongroup 1.1.2
executing 1.2.0
fastapi 0.100.0
ffmpy 0.3.1
filelock 3.12.2
Flask 2.3.2
Flask-Babel 2.0.0
fonttools 4.41.0
frozenlist 1.4.0
fsspec 2023.6.0
future 0.18.3
gitdb 4.0.10
GitPython 3.1.32
gradio 3.39.0
gradio_client 0.3.0
gunicorn 21.2.0
h11 0.14.0
httpcore 0.17.3
httpx 0.24.1
huggingface-hub 0.16.4
idna 3.4
importlib-metadata 6.8.0
ipykernel 6.24.0
ipython 8.14.0
itsdangerous 2.1.2
jedi 0.18.2
jieba 0.42.1
Jinja2 3.1.2
joblib 1.3.1
jsonschema 4.18.4
jsonschema-specifications 2023.7.1
jupyter_client 8.3.0
jupyter_core 5.3.1
kiwisolver 1.4.4
linkify-it-py 2.0.2
markdown-it-py 2.2.0
MarkupSafe 2.1.3
matplotlib 3.7.2
matplotlib-inline 0.1.6
mdit-py-plugins 0.3.3
mdurl 0.1.1
multidict 6.0.4
multiprocess 0.70.12.2
nest-asyncio 1.5.6
numpy 1.25.1
onnx 1.14.0
opencv-python 4.8.0.74
opt-einsum 3.3.0
orjson 3.9.2
packaging 23.1
paddle-bfloat 0.1.7
paddle2onnx 1.0.6
paddlefsl 1.1.0
paddlehub 2.3.1
paddlenlp 2.6.0
paddlepaddle 2.5.0
pandas 2.0.3
parso 0.8.3
pexpect 4.8.0
pickleshare 0.7.5
Pillow 10.0.0
pip 23.1.2
platformdirs 3.9.1
prompt-toolkit 3.0.39
protobuf 3.20.3
psutil 5.9.5
ptyprocess 0.7.0
pure-eval 0.2.2
pyarrow 12.0.1
pycryptodome 3.18.0
pydantic 2.0.3
pydantic_core 2.3.0
pydeck 0.8.1b0
pydub 0.25.1
Pygments 2.15.1
Pympler 1.0.1
pyparsing 3.0.9
pypinyin 0.49.0
python-dateutil 2.8.2
python-multipart 0.0.6
pytz 2023.3
PyYAML 6.0.1
pyzmq 25.1.0
rarfile 4.0
referencing 0.30.0
requests 2.31.0
rich 13.4.2
rpds-py 0.9.2
safetensors 0.3.2
scikit-learn 1.3.0
scipy 1.11.1
semantic-version 2.10.0
semver 3.0.1
sentencepiece 0.1.99
seqeval 1.2.2
setuptools 67.8.0
six 1.16.0
smmap 5.0.0
sniffio 1.3.0
stack-data 0.6.2
starlette 0.27.0
streamlit 1.13.0
streamlit-image-comparison 0.0.4
threadpoolctl 3.2.0
toml 0.10.2
toolz 0.12.0
tornado 6.3.2
tqdm 4.65.0
traitlets 5.9.0
typer 0.9.0
typing_extensions 4.7.1
tzdata 2023.3
tzlocal 5.0.1
uc-micro-py 1.0.2
urllib3 2.0.4
uvicorn 0.23.1
validators 0.20.0
visualdl 2.4.2
watchdog 3.0.0
wcwidth 0.2.6
websockets 11.0.3
Werkzeug 2.3.6
wheel 0.38.4
xxhash 3.2.0
yarl 1.9.2
zipp 3.16.2
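
Most of the packages listed above are probably unrelated to the problem. A minimal sketch, using only the standard library (`importlib.metadata` requires Python 3.8 or later), that prints just the versions likely to matter in a report like this one. The helper name `report_versions` and the default package selection are assumptions made for illustration; they are not part of PaddleNLP.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def report_versions(packages=("paddlenlp", "paddlepaddle")):
    """Return {name: version or None} for the given packages, plus Python itself.

    `report_versions` is a hypothetical helper, not a PaddleNLP API.
    """
    found = {"python": platform.python_version()}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None  # package is not installed in this environment
    return found


if __name__ == "__main__":
    for name, ver in report_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Pasting these few lines into an issue is usually easier to read than a full `pip list`.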

>>> schema = ['时间', '选手', '赛事名称']  # Define the schema for entity extraction
>>> ie = Taskflow('information_extraction', schema=schema)
>>> pprint(ie("2月8日上午北京冬奥会自由式滑雪女子大跳台决赛中中国选手谷爱凌以188.25分获得金牌!"))

No output here.

twosnowman commented 1 year ago

Same problem. The old version I tried about half a year ago worked, but the current one does not.

VelChen commented 1 year ago

paddlenlp 2.6.0 paddlepaddle 2.5.1 Python 3.7.12

>>> schema = ['肿瘤的大小', '肿瘤的个数', '肝癌级别', '脉管内癌栓分级']
>>> ie.set_schema(schema)
>>> pprint(ie("(右肝肿瘤)肝细胞性肝癌(II-III级,梁索型和假腺管型),肿瘤包膜不完整,紧邻肝被膜,侵及周围肝组织,未见脉管内癌栓(MVI分级:M0 级)及卫星子灶形成。(肿物1个,大小4.2×4.0×2.8cm)。"))
[{}]
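
The `[{}]` above suggests that the call returned one result dict for the single input text, with no entry for any schema label. A minimal sketch for listing which labels came back empty, assuming the result is a list with one dict per input text keyed by schema label, and that the schema is a flat list of strings. The helper name `missing_labels` is hypothetical and not part of PaddleNLP.

```python
def missing_labels(results, schema):
    """For each input text, list the schema labels with no extracted span.

    Assumes `results` is a list of dicts (one per input text) keyed by label.
    """
    return [
        [label for label in schema if not result.get(label)]
        for result in results
    ]


schema = ['肿瘤的大小', '肿瘤的个数', '肝癌级别', '脉管内癌栓分级']
# With the empty result shown above, every label is reported as missing.
print(missing_labels([{}], schema))
```

This only separates "the model ran but extracted nothing" from "the call returned nothing at all"; it does not explain why the extraction is empty.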

msute commented 1 year ago

You have to train it on your own dataset. Entering other labels won't work; the model only works with the few examples they provide.