ghosthamlet / gpt2-ml-torch

PyTorch model for https://github.com/imcaspar/gpt2-ml
Apache License 2.0
79 stars · 16 forks

Model loading #9

Closed zyan97 closed 3 years ago

zyan97 commented 3 years ago

Hello, I wanted to load the model to try out its generation quality, but I get the following error:

2020-12-08 03:39:34.031317: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at /content/drive/MyDrive/Text_Generation/mega-clue-tok and are newly initialized: ['lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

transformers==3.1.0, and the model is mega-clue-tok; I also compared its sha256 checksum and it matches. Thanks for your help.

ghosthamlet commented 3 years ago

Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at /content/drive/MyDrive/Text_Generation/mega-clue-tok and are newly initialized: ['lm_head.weight']

The lm_head layer shares its weights with the embedding layer and has no independent weights of its own, so this message is expected and harmless. Have you tried generating after loading, or does generation raise some other error?
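The weight tying described above can be sketched in plain PyTorch (a toy illustration, not the repo's actual code): the LM head reuses the embedding matrix as one shared Parameter, so the checkpoint stores only a single copy and `from_pretrained` reports `lm_head.weight` as "newly initialized".

```python
import torch
from torch import nn

# Toy sketch of weight tying between the input embedding and the LM head.
vocab_size, hidden = 8, 4
embed = nn.Embedding(vocab_size, hidden)           # weight: (vocab_size, hidden)
lm_head = nn.Linear(hidden, vocab_size, bias=False)  # weight: (vocab_size, hidden)
lm_head.weight = embed.weight  # tie: both layers now share one Parameter

# Same underlying storage, so the checkpoint needs only one copy.
assert lm_head.weight.data_ptr() == embed.weight.data_ptr()
```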

zyan97 commented 3 years ago

I am running this in Colab. After this loading message appears, execution is automatically interrupted:

2020-12-08 10:32:02.367037: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at /content/drive/MyDrive/Text_Generation/mega-clue-tok and are newly initialized: ['lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
^C

ghosthamlet commented 3 years ago

It may be related to the Colab environment. I can't get through the firewall to use Colab, so I can't debug this myself. You can run the code below to see exactly where it gets interrupted:

from gpt2_ml_torch.generate import build_model
from transformers import pipeline

model, tokenizer, info = build_model('/content/drive/MyDrive/Text_Generation/mega-clue-tok')

# cpu: -1, gpu: 0
device = 0
nlp = pipeline('text-generation', model=model, tokenizer=tokenizer, device=device)
nlp(
    '中国人', 
    num_return_sequences=1, 
    max_length=100, 
    do_sample=True,
    return_dict=False
)
zyan97 commented 3 years ago


When running this code, the following appears:

2020-12-08 11:41:21.296594: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 595, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 781, in _legacy_load
    deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly)
RuntimeError: read(): fd 4 failed with Transport endpoint is not connected

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_utils.py", line 856, in from_pretrained
    state_dict = torch.load(resolved_archive_file, map_location="cpu")
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 595, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 214, in __exit__
    self.file_like.close()
OSError: [Errno 107] Transport endpoint is not connected

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/content/gpt2-ml-torch/run.py", line 5, in <module>
    model, tokenizer, info = build_model('/content/drive/MyDrive/Text_Generation/mega-clue-tok')
  File "/content/gpt2-ml-torch/gpt2_ml_torch/generate.py", line 18, in build_model
    model, info = GPT2LMHeadModel.from_pretrained(model_path, output_loading_info=True)
  File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_utils.py", line 859, in from_pretrained
    "Unable to load weights from pytorch checkpoint file. "
OSError: Unable to load weights from pytorch checkpoint file. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.

ghosthamlet commented 3 years ago

The model path '/content/drive/MyDrive/Text_Generation/mega-clue-tok' is probably wrong, or the process does not have read permission for it.
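Before calling build_model, a quick sanity check like the following (a hypothetical helper, not part of the repo) can confirm that the files `from_pretrained` needs are actually visible and readable from the runtime, which distinguishes a bad path from a dropped Drive mount:

```python
import os

def check_checkpoint(model_path):
    """Report which checkpoint files are present and readable at model_path."""
    report = {}
    for fname in ('config.json', 'pytorch_model.bin'):
        p = os.path.join(model_path, fname)
        report[fname] = os.path.isfile(p) and os.access(p, os.R_OK)
    return report

print(check_checkpoint('/content/drive/MyDrive/Text_Generation/mega-clue-tok'))
```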

zyan97 commented 3 years ago

Thank you for your reply. I hadn't noticed that there wasn't enough RAM; switching to a runtime with more RAM fixed it. The output is now as follows. Do I need to add some kind of prefix for generation? Thanks.

2020-12-08 12:49:53.931044: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at /content/gpt2-ml-torch/models/mega-clue-tok and are newly initialized: ['lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Setting pad_token_id to 50256 (first eos_token_id) to generate sequence
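For reference, a back-of-the-envelope estimate of why a default Colab runtime can run out of memory here: the mega model has roughly 1.5 billion parameters (per the upstream gpt2-ml README), and torch.load materializes the full fp32 state dict before the model object takes ownership of it, so peak usage during loading can approach twice the checkpoint size. The figures below are approximations, not measured values:

```python
# Rough RAM estimate for loading a ~1.5B-parameter checkpoint in fp32.
params = 1.5e9           # approximate parameter count of the mega model
bytes_per_param = 4      # float32
checkpoint_gb = params * bytes_per_param / 1024**3
# Peak usage while torch.load + from_pretrained both hold the weights
# can be roughly double the checkpoint size.
print(f'checkpoint ~ {checkpoint_gb:.1f} GB; peak while loading can be ~2x')
```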

zyan97 commented 3 years ago

Generation works with the original generate.py. Thank you so much!