Lightning-Universe / lightning-transformers

Flexible components pairing 🤗 Transformers with :zap: PyTorch Lightning
https://lightning-transformers.readthedocs.io
Apache License 2.0
607 stars 77 forks

language model load from checkpoint error #295

Closed omerarshad closed 1 year ago

omerarshad commented 1 year ago

🐛 Bug

Saving an aggregated checkpoint for the language modeling transformer gives an error:

RuntimeError: Error(s) in loading state_dict for LanguageModelingTransformer:
    Missing key(s) in state_dict: "model.lm_head.weight". 

To Reproduce

from pytorch_lightning.utilities.deepspeed import convert_zero_checkpoint_to_fp32_state_dict

# Aggregate the sharded DeepSpeed ZeRO checkpoint into a single fp32 state dict
convert_zero_checkpoint_to_fp32_state_dict(
    "./recreate_model/epoch=0-step=363.ckpt/",
    "./recreate_model/pytorch_model.bin"
)

# Load best model from aggregated checkpoint file
best_model = LanguageModelingTransformer.load_from_checkpoint(
    "./recreate_model/pytorch_model.bin"
)
Borda commented 1 year ago

Could you please share the full trace? :rabbit: This seems to be a duplicate of https://github.com/Lightning-AI/lightning-transformers/issues/273#issuecomment-1243397261, so let's keep only one :otter: