PaddlePaddle / Paddle

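PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of 『飞桨』 (PaddlePaddle): high-performance single-machine and distributed training and cross-platform deployment for deep learning & machine learning)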
http://www.paddlepaddle.org/
Apache License 2.0
22.05k stars 5.54k forks

from_pretrained raises unsupported operand type(s) for +: 'PosixPath' and 'str' #38028

Closed Memelank closed 1 year ago

Memelank commented 2 years ago

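Hello! When I try to load the ACL2021-PAIR model with the code below, I get the error shown further down. Is it caused by the model having been saved with paddlepaddle 1.8.2? Is there a simple way to load this model so that I can test tokenization and inference?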

query = 'what is paula deen\'s brother'

import numpy as np
import paddle.fluid.dygraph as D
from ernie.tokenizing_ernie import ErnieTokenizer
from ernie.modeling_ernie import ErnieModel

D.guard().__enter__() # activate paddle `dygraph` mode

model = ErnieModel.from_pretrained('/home1/lj/workspace/dprProject/Research/NLP/ACL2021-PAIR/checkpoint/marco_test_encoder')    # Try to get pretrained model from server, make sure you have network connection
# model = ErnieModel.from_pretrained('/home1/lj/workspace/dprProject/Research/NLP/ACL2021-PAIR/checkpoint/ernie_base_twin_init/params')    # Try to get pretrained model from server, make sure you have network connection
model.eval()
tokenizer = ErnieTokenizer.from_pretrained('/home1/lj/workspace/dprProject/Research/NLP/ACL2021-PAIR/checkpoint/marco_test_encoder')

ids, _ = tokenizer.encode(query)
print(ids)
ids = D.to_variable(np.expand_dims(ids, 0))  # insert extra `batch` dimension
pooled, encoded = model(ids)                 # eager execution
print(pooled.numpy())                        # convert results to numpy

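(This is my file layout. The original checkpoint contained only the saved model files; I added the matching config and vocab myself and adjusted the paths.)

[image: checkpoint directory layout]

This is the error I get: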

[INFO] 2021-12-10 11:06:42,390 [modeling_ernie.py:  271]:    pretrain dir /home1/lj/workspace/dprProject/Research/NLP/ACL2021-PAIR/checkpoint/marco_test_encoder not in {'ernie-1.0': 'https://ernie-github.cdn.bcebos.com/model-ernie1.0.1.tar.gz', 'ernie-2.0-en': 'https://ernie-github.cdn.bcebos.com/model-ernie2.0-en.1.tar.gz', 'ernie-2.0-large-en': 'https://ernie-github.cdn.bcebos.com/model-ernie2.0-large-en.1.tar.gz', 'ernie-tiny': 'https://ernie-github.cdn.bcebos.com/model-ernie_tiny.1.tar.gz', 'ernie-gram-zh': 'https://ernie-github.cdn.bcebos.com/model-ernie-gram-zh.1.tar.gz', 'ernie-gram-en': 'https://ernie-github.cdn.bcebos.com/model-ernie-gram-en.1.tar.gz'}, read from local
W1210 11:06:42.392050  8166 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.2, Runtime API Version: 11.2
W1210 11:06:42.396075  8166 device_context.cc:465] device: 0, cuDNN Version: 8.1.
[INFO] 2021-12-10 11:06:46,677 [modeling_ernie.py:  285]:    loading pretrained model from /home1/lj/workspace/dprProject/Research/NLP/ACL2021-PAIR/checkpoint/marco_test_encoder
Traceback (most recent call last):
  File "check.py", line 43, in <module>
    model = ErnieModel.from_pretrained('/home1/lj/workspace/dprProject/Research/NLP/ACL2021-PAIR/checkpoint/marco_test_encoder')    # Try to get pretrained model from server, make sure you have network connection
  File "/home1/lj/anaconda3/envs/pad/lib/python3.7/site-packages/ernie/modeling_ernie.py", line 293, in from_pretrained
    m = P.load(state_dict_path)
  File "/home1/lj/anaconda3/envs/pad/lib/python3.7/site-packages/paddle/framework/io.py", line 985, in load
    load_result = _legacy_load(path, **configs)
  File "/home1/lj/anaconda3/envs/pad/lib/python3.7/site-packages/paddle/framework/io.py", line 1003, in _legacy_load
    model_path, config = _build_load_path_and_config(path, config)
  File "/home1/lj/anaconda3/envs/pad/lib/python3.7/site-packages/paddle/framework/io.py", line 141, in _build_load_path_and_config
    prefix_format_path = path + INFER_MODEL_SUFFIX
TypeError: unsupported operand type(s) for +: 'PosixPath' and 'str'
paddle-bot-old[bot] commented 2 years ago

Hi! We've received your issue; please be patient while waiting for a response. We will arrange for technicians to answer your question as soon as possible. Please make sure you have provided a clear problem description, reproduction code, environment and version details, and the error message. You may also check the API documentation, the FAQ, past GitHub Issues, and the AI community for an answer. Have a nice day!

qili93 commented 2 years ago

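@Memelank Is the model code you are using here from PaddleNLP?

This error can have several causes. You could try narrowing it down as follows:

  1. For ErnieModel.from_pretrained, try passing a path that stops at the file name, for example /home1/lj/workspace/dprProject/Research/NLP/ACL2021-PAIR/checkpoint/marco_test_encoder/save_weights, without the .params suffix (this is the Paddle 2.x convention; in 1.x you pass the path up to the directory). For the details, you will need to check how the ErnieModel.from_pretrained interface is defined in the source.

  2. The model formats saved by Paddle 1.x and 2.x are indeed incompatible. It is best to use a 1.x release to run a model saved with 1.x. There is currently no workaround for this, so see whether a copy of the model in the 2.x format is available.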
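Independently of the format question, the TypeError itself can be reproduced without Paddle. The traceback ends at `path + INFER_MODEL_SUFFIX`, and `pathlib` path objects do not support `+` with a `str`. The following is a minimal sketch, not Paddle's or ERNIE's actual code: `build_prefix` and `try_build_prefix` are hypothetical helpers, the `.pdmodel` value for `INFER_MODEL_SUFFIX` is an assumption, and the `save_weights` file name is only borrowed from the example path above.

```python
from pathlib import Path

# Stand-in for the constant named in the traceback; the value is an assumption.
INFER_MODEL_SUFFIX = ".pdmodel"


def build_prefix(path):
    """Hypothetical helper mimicking the failing line: `+` with a str suffix."""
    return path + INFER_MODEL_SUFFIX


def try_build_prefix(path):
    """Return the concatenated prefix, or the TypeError message if `+` fails."""
    try:
        return build_prefix(path)
    except TypeError as exc:
        return f"TypeError: {exc}"


# Illustrative relative path; not the reporter's real checkpoint location.
weights = Path("checkpoint") / "marco_test_encoder" / "save_weights"

print(try_build_prefix(weights))       # a Path object: `+` is unsupported
print(try_build_prefix(str(weights)))  # a plain str: concatenation works
```

If the caller controls the argument, converting the path with `str(...)` before it reaches the loader avoids this particular TypeError. Whether the 1.x checkpoint then loads under 2.x is a separate question, covered by point 2.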

Memelank commented 2 years ago

Thank you. I'll try downgrading Paddle.