microsoft / CodeXGLUE


code-to-text reload model weights #83

Closed KevinHuuu closed 3 years ago

KevinHuuu commented 3 years ago

Hi there! In the code-to-text experiment, after saving a model checkpoint and trying to reload it, I get the following error:

10/13/2021 06:35:49 - INFO - transformers.tokenization_utils -   Model name 'microsoft/codebert-base' not found in model shortcut name list (roberta-base, roberta-large, roberta-large-mnli, distilroberta-base, roberta-base-openai-detector, roberta-large-openai-detector). Assuming 'microsoft/codebert-base' is a path, a model identifier, or url to a directory containing tokenizer files.
10/13/2021 06:35:50 - INFO - transformers.tokenization_utils -   loading file https://s3.amazonaws.com/models.huggingface.co/bert/microsoft/codebert-base/vocab.json from cache at /home/ubuntu/.cache/torch/transformers/aca4dbdf4f074d4e071c2664901fec33c8aa69c35aa0101bc669ed4b44d1f6c3.6a4061e8fc00057d21d80413635a86fdcf55b6e7594ad9e25257d2f99a02f4be
10/13/2021 06:35:50 - INFO - transformers.tokenization_utils -   loading file https://s3.amazonaws.com/models.huggingface.co/bert/microsoft/codebert-base/merges.txt from cache at /home/ubuntu/.cache/torch/transformers/779a2f0c38ba2ff65d9a3ee23e58db9568f44a20865c412365e3dc540f01743f.70bec105b4158ed9a1747fea67a43f5dee97855c64d62b6ec3742f4cfdb5feda
10/13/2021 06:35:50 - INFO - transformers.tokenization_utils -   loading file https://s3.amazonaws.com/models.huggingface.co/bert/microsoft/codebert-base/added_tokens.json from cache at None
10/13/2021 06:35:50 - INFO - transformers.tokenization_utils -   loading file https://s3.amazonaws.com/models.huggingface.co/bert/microsoft/codebert-base/special_tokens_map.json from cache at /home/ubuntu/.cache/torch/transformers/5a191080da4f00859b5d3d29529f57894583e00ab07b7c940d65c33db4b25d4d.16f949018cf247a2ea7465a74ca9a292212875e5fd72f969e0807011e7f192e4
10/13/2021 06:35:50 - INFO - transformers.tokenization_utils -   loading file https://s3.amazonaws.com/models.huggingface.co/bert/microsoft/codebert-base/tokenizer_config.json from cache at /home/ubuntu/.cache/torch/transformers/1b4723c5fb2d933e11c399450ea233aaf33f093b5cbef3ec864624735380e490.70b5dbd5d3b9b4c9bfb3d1f6464291ff52f6a8d96358899aa3834e173b45092d
10/13/2021 06:35:51 - INFO - transformers.modeling_utils -   loading weights file https://s3.amazonaws.com/models.huggingface.co/bert/microsoft/codebert-base/pytorch_model.bin from cache at /home/ubuntu/.cache/torch/transformers/3416309b564f60f87c1bc2ce8d8a82bb7c1e825b241c816482f750b48a5cdc26.96251fe4478bac0cff9de8ae3201e5847cee59aebbcafdfe6b2c361f9398b349
10/13/2021 06:35:55 - INFO - __main__ -   reload model from model/python/checkpoint-best-ppl/pytorch_model.bin
Traceback (most recent call last):
  File "run.py", line 575, in <module>
    main()
  File "run.py", line 321, in main
    model.load_state_dict(torch.load(args.load_model_path))
  File "/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1407, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Seq2Seq:
        Unexpected key(s) in state_dict: "encoder.embeddings.position_ids".
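
For reference, the unexpected key can be confirmed by inspecting the checkpoint directly. This is only a minimal diagnostic sketch (the path is the checkpoint from my run):

import torch

# Load the checkpoint that run.py saved, on CPU.
state_dict = torch.load("model/python/checkpoint-best-ppl/pytorch_model.bin", map_location="cpu")

# List any keys mentioning position_ids; this is the buffer that the
# Seq2Seq model built with the installed transformers version does not expect.
for key in state_dict:
    if "position_ids" in key:
        print(key)  # prints: encoder.embeddings.position_ids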

The command I used is:

# train
cd code
lang=python #programming language
lr=5e-5
batch_size=16
beam_size=10
source_length=256
target_length=128
data_dir=../dataset
output_dir=lang2code_from_epoch_6_model/$lang
train_file=$data_dir/$lang/train.jsonl
dev_file=$data_dir/$lang/valid.jsonl
epochs=10
pretrained_model=microsoft/codebert-base #Roberta: roberta-base
load_model_path=model/python/checkpoint-best-ppl/pytorch_model.bin

python3 run.py --do_train --do_eval --model_type roberta --load_model_path $load_model_path --model_name_or_path $pretrained_model --train_filename $train_file --train_filename_lang2code $train_filename_lang2code --dev_filename $dev_file --output_dir $output_dir --max_source_length $source_length --max_target_length $target_length --beam_size $beam_size --train_batch_size $batch_size --eval_batch_size $batch_size --learning_rate $lr --num_train_epochs $epochs --gradient_accumulation_steps 2
guoday commented 3 years ago

This is caused by a different version of transformers. You can update transformers with "pip install --upgrade transformers".
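
If upgrading is not convenient, a possible workaround (not part of run.py; a sketch that replaces the load_state_dict call in main()) is to drop the stale buffer key before loading:

import torch

# Load the checkpoint on CPU and remove the buffer key that the current
# model does not register; all other entries load as before.
state_dict = torch.load(args.load_model_path, map_location="cpu")
state_dict.pop("encoder.embeddings.position_ids", None)
model.load_state_dict(state_dict)

Since position_ids is a fixed, non-learned buffer, dropping it from the checkpoint should not affect the restored weights.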