Closed zyan97 closed 3 years ago
Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at /content/drive/MyDrive/Text_Generation/mega-clue-tok and are newly initialized: ['lm_head.weight']
The lm_head layer shares its weights with the embedding layer and has no independent weights of its own, so this message is normal and nothing to worry about. Have you tried generating after loading? Or did you get some other error message?
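For context, this kind of weight tying can be sketched in a few lines of PyTorch (a minimal illustration with toy sizes, not the actual GPT2LMHeadModel code):

```python
import torch.nn as nn

# Minimal sketch of GPT-2-style weight tying (toy sizes, not the real model).
emb = nn.Embedding(100, 16)               # token embedding table
lm_head = nn.Linear(16, 100, bias=False)  # output projection back to the vocab
lm_head.weight = emb.weight               # tie: the head reuses the embedding tensor

# Because both modules hold the very same tensor object, a checkpoint only
# needs to store the embedding; 'lm_head.weight' is then reported as
# "newly initialized" when the model is loaded.
assert lm_head.weight is emb.weight
```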
I'm running this in Colab. After this message appears while loading the model, execution is automatically interrupted:

2020-12-08 10:32:02.367037: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at /content/drive/MyDrive/Text_Generation/mega-clue-tok and are newly initialized: ['lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
^C
This may be related to the Colab environment. I can't get past the firewall to use Colab, so I have no way to debug it myself. You can use the code below to find out exactly where it gets interrupted:
from gpt2_ml_torch.generate import build_model
from transformers import pipeline
model, tokenizer, info = build_model('/content/drive/MyDrive/Text_Generation/mega-clue-tok')
# cpu: -1, gpu: 0
device = 0
nlp = pipeline('text-generation', model=model, tokenizer=tokenizer, device=device)
nlp(
    '中国人',
    num_return_sequences=1,
    max_length=100,
    do_sample=True,
    return_dict=False
)
With this code, the following message appears:

2020-12-08 11:41:21.296594: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 595, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 781, in _legacy_load
    deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly)
RuntimeError: read(): fd 4 failed with Transport endpoint is not connected
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_utils.py", line 856, in from_pretrained
    state_dict = torch.load(resolved_archive_file, map_location="cpu")
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 595, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 214, in __exit__
    self.file_like.close()
OSError: [Errno 107] Transport endpoint is not connected
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/content/gpt2-ml-torch/run.py", line 5, in
The model path '/content/drive/MyDrive/Text_Generation/mega-clue-tok' is probably wrong, or you don't have read permission for it.
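A quick way to rule out the path/permission possibility before calling `build_model` (a small sketch; `check_model_dir` is a hypothetical helper, not part of the repo):

```python
import os

def check_model_dir(path):
    """Return (exists, readable) for a local checkpoint directory."""
    # os.path.isdir: the checkpoint folder actually exists at this path
    # os.access(..., os.R_OK): the current process may read it
    return os.path.isdir(path), os.access(path, os.R_OK)

# e.g. check_model_dir('/content/drive/MyDrive/Text_Generation/mega-clue-tok')
```

If this returns anything other than `(True, True)`, the Drive mount or the path itself is the problem rather than the model files.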
Thanks for your reply. I hadn't noticed that the RAM was insufficient; after switching to an instance with more RAM it works.
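For reference, a silent `^C` during `torch.load` on Colab is usually the out-of-memory killer. Available RAM can be checked before loading a large checkpoint (a Linux-only sketch that parses /proc/meminfo; `available_ram_gb` is a hypothetical helper):

```python
def available_ram_gb():
    """Available system memory in GiB (Linux only: parses /proc/meminfo)."""
    with open('/proc/meminfo') as f:
        for line in f:
            if line.startswith('MemAvailable:'):
                return int(line.split()[1]) / 1024 ** 2  # value is in kB
    return None
```

If the reported figure is well below the checkpoint size, loading will be killed before any Python traceback is printed.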
Now the messages are as follows. Do I need to add some kind of prefix for generation? Thanks.
2020-12-08 12:49:53.931044: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at /content/gpt2-ml-torch/models/mega-clue-tok and are newly initialized: ['lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Setting `pad_token_id` to 50256 (first `eos_token_id`) to generate sequence
Generation works with the previous generate.py. Thank you so much!
Hello, I'd like to load the model and try out its generation, but I get the following error:

2020-12-08 03:39:34.031317: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at /content/drive/MyDrive/Text_Generation/mega-clue-tok and are newly initialized: ['lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
transformers==3.1.0. The model is mega-clue-tok, and I've verified that its sha256 matches. Thanks for your help.