PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
https://paddlenlp.readthedocs.io
Apache License 2.0
11.96k stars · 2.91k forks

[Question]: Can't load the model for 'gpt-cpm-small-cn-distill' #8230

Closed · lao-xu closed this 1 week ago

lao-xu commented 5 months ago

Please describe your question

from paddlenlp.transformers import GPTChineseTokenizer, GPTLMHeadModel

model_name = "gpt-cpm-small-cn-distill"

tokenizer = GPTChineseTokenizer.from_pretrained(model_name)
model = GPTLMHeadModel.from_pretrained(model_name)

Why does this raise "OSError: Can't load the model for 'gpt-cpm-small-cn-distill'. If you were trying to load it from 'https://paddlenlp.bj.bcebos.com/'"?

w5688414 commented 5 months ago

Fixed; or try rolling back to version 2.6. https://github.com/PaddlePaddle/PaddleNLP/pull/8253

lao-xu commented 5 months ago

> Fixed; or try rolling back to version 2.6. #8253

It still doesn't work:

import paddle  # needed for paddle.to_tensor below
from paddlenlp.transformers import GPTChineseTokenizer, GPTLMHeadModel

model_name = "gpt-cpm-small-cn-distill"

tokenizer = GPTChineseTokenizer.from_pretrained(model_name)
model = GPTLMHeadModel.from_pretrained(model_name)
model.eval()

inputs = "花间一壶酒,独酌无相亲。举杯邀明月,"
inputs_ids = tokenizer(inputs)["input_ids"]
inputs_ids = paddle.to_tensor(inputs_ids, dtype="int64").unsqueeze(0)

outputs, _ = model.generate(input_ids=inputs_ids, max_length=10, decode_strategy="greedy_search", use_fast=True)

result = tokenizer.convert_ids_to_string(outputs[0].numpy().tolist())

print("Model input:", inputs)
print("Result:", result)


Traceback (most recent call last)
File /opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/model_utils.py:1581, in PretrainedModel._resolve_model_file_path(cls, pretrained_model_name_or_path, from_hf_hub, from_aistudio, cache_dir, subfolder, config, convert_from_torch, use_safetensors, variant)
   1579 if pretrained_model_name_or_path in cls.pretrained_init_configuration:
   1580     # fetch the weight url from the pretrained_resource_files_map
-> 1581     resource_file_url = cls.pretrained_resource_files_map["model_state"][
   1582         pretrained_model_name_or_path
   1583     ]
   1584 resolved_archive_file = cached_file(
   1585     resource_file_url, _add_variant(PADDLE_WEIGHTS_NAME, variant), **cached_file_kwargs
   1586 )

KeyError: 'model_state'
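The `KeyError` above comes from the weight-URL lookup in `_resolve_model_file_path`: the model name is present in the class-level `pretrained_init_configuration` registry, but the `"model_state"` entry is missing from `pretrained_resource_files_map`, so the direct indexing fails. A minimal sketch of the failure mode (the registry contents and `resolve_weight_url` helper here are hypothetical stand-ins, not PaddleNLP's actual data or API):

```python
# Hypothetical, simplified stand-ins for PaddleNLP's class-level registries.
pretrained_init_configuration = {"gpt-cpm-small-cn-distill": {"hidden_size": 768}}
pretrained_resource_files_map = {}  # "model_state" entry missing -> the reported bug


def resolve_weight_url(name):
    """Mimic the lookup at model_utils.py:1581 (illustrative only)."""
    if name in pretrained_init_configuration:
        # Direct indexing: raises KeyError when "model_state" is absent.
        return pretrained_resource_files_map["model_state"][name]
    raise ValueError(f"unknown model: {name}")


try:
    resolve_weight_url("gpt-cpm-small-cn-distill")
except KeyError as e:
    print("KeyError:", e)  # KeyError: 'model_state'
```

This is why PR #8253 is described as the fix: restoring the missing map entry makes the indexing succeed again.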

During handling of the above exception, another exception occurred:

OSError Traceback (most recent call last)
Cell In[3], line 6
      3 model_name = "gpt-cpm-small-cn-distill"
      5 tokenizer = GPTChineseTokenizer.from_pretrained(model_name)
----> 6 model = GPTLMHeadModel.from_pretrained(model_name)
      7 model.eval()
      9 inputs = "花间一壶酒,独酌无相亲。举杯邀明月,"

File /opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/model_utils.py:2116, in PretrainedModel.from_pretrained(cls, pretrained_model_name_or_path, *args, **kwargs)
   2113 use_keep_in_fp32_modules = False
   2115 # resolve model_weight file
-> 2116 resolved_archive_file, sharded_metadata, is_sharded = cls._resolve_model_file_path(
   2117     pretrained_model_name_or_path,
   2118     cache_dir=cache_dir,
   2119     subfolder=subfolder,
   2120     from_hf_hub=from_hf_hub,
   2121     from_aistudio=from_aistudio,
   2122     config=config,
   2123     convert_from_torch=convert_from_torch,
   2124     use_safetensors=use_safetensors,
   2125     variant=variant,
   2126 )
   2128 # load pt weights early so that we know which dtype to init the model under
   2129 if not is_sharded and state_dict is None:
   2130     # Time to load the checkpoint

File /opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/model_utils.py:1638, in PretrainedModel._resolve_model_file_path(cls, pretrained_model_name_or_path, from_hf_hub, from_aistudio, cache_dir, subfolder, config, convert_from_torch, use_safetensors, variant)
   1636 logger.info(e)
   1637 # For any other exception, we throw a generic error.
-> 1638 raise EnvironmentError(
   1639     f"Can't load the model for '{pretrained_model_name_or_path}'. If you were trying to load it"
   1640     " from 'https://paddlenlp.bj.bcebos.com/'"
   1641 )
   1643 if is_local:
   1644     logger.info(f"Loading weights file {archive_file}")

OSError: Can't load the model for 'gpt-cpm-small-cn-distill'. If you were trying to load it from 'https://paddlenlp.bj.bcebos.com/'
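Note how the traceback chains the two exceptions: the original `KeyError` is caught, logged, and re-raised as a generic `OSError`, which is why the final message hides the real cause. Python keeps the root cause reachable via `__cause__`/`__context__`, which is useful when debugging reports like this one. A small illustration of the pattern (`load_weights` is an illustrative stand-in, not PaddleNLP's code):

```python
def load_weights(name):
    """Illustrative stand-in for the catch-and-rewrap in model_utils.py."""
    try:
        raise KeyError("model_state")  # stand-in for the failed registry lookup
    except Exception as e:
        # Re-raising a broad error loses the specific cause from the message,
        # but explicit chaining with "from e" preserves it on the exception.
        raise OSError(
            f"Can't load the model for '{name}'. If you were trying to load it"
            " from 'https://paddlenlp.bj.bcebos.com/'"
        ) from e


try:
    load_weights("gpt-cpm-small-cn-distill")
except OSError as err:
    # Inspect the chained root cause instead of just the surface message.
    print(type(err.__cause__).__name__, err.__cause__)  # KeyError 'model_state'
```

In the real traceback the chaining is implicit (`During handling of the above exception, another exception occurred`), so scrolling up to the first exception reveals the actual bug.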

w5688414 commented 5 months ago

I tested it several times and it works fine:

(screenshot of a successful run)

lao-xu commented 5 months ago

I'm running it on AI Studio and keep hitting this error. Could it be related to the cloud environment?

w5688414 commented 5 months ago

Check your Paddle version; you can also try installing the develop build.
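When comparing the installed version (e.g. the output of `paddle.__version__` or `paddlenlp.__version__`) against the suggested 2.6, a small helper can make the comparison explicit. This is a simplified sketch, not Paddle's own version handling; pre-release suffixes such as `.dev` are simply ignored:

```python
def version_tuple(v: str):
    """Parse a dotted version like '2.6.1' into a comparable tuple.

    Non-numeric components (e.g. the 'dev' in '0.0.0.dev') end the parse,
    so develop builds compare by their numeric prefix only.
    """
    parts = []
    for p in v.split("."):
        if p.isdigit():
            parts.append(int(p))
        else:
            break
    return tuple(parts)


# e.g. check whether an installed version is at least the suggested 2.6
print(version_tuple("2.6.1") >= version_tuple("2.6"))  # True
```

Tuple comparison handles versions of different lengths naturally, since `(2, 6, 1) >= (2, 6)` holds element by element.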

yangquanbiubiu commented 5 months ago

I'm running on a Kunlunxin R200 and hit the same problem. Have you solved it?

lao-xu commented 5 months ago

> I'm running on a Kunlunxin R200 and hit the same problem. Have you solved it?

No. I upgraded PaddleNLP and it still fails; I don't know why.

w5688414 commented 5 months ago

Try the following commands:

pip uninstall paddlenlp
git clone https://github.com/PaddlePaddle/PaddleNLP.git
cd PaddleNLP
pip install -e .

github-actions[bot] commented 3 months ago

This issue is stale because it has been open for 60 days with no activity.

github-actions[bot] commented 3 weeks ago

This issue is stale because it has been open for 60 days with no activity.

github-actions[bot] commented 1 week ago

This issue was closed because it has been inactive for 14 days since being marked as stale.