pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
BSD 3-Clause "New" or "Revised" License

Missing Keys in state_dict #172

Open bjohn22 opened 1 month ago

bjohn22 commented 1 month ago

I downloaded nvidia/Llama3-ChatQA-1.5-8B manually from HF to a local directory and ran scripts/convert_hf_checkpoint.py. When I then tried to run generate.py with the local checkpoint dir, I got this error:

raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Transformer:
    Missing key(s) in state_dict: "tok_embeddings.weight", "layers.0.attention.wqkv.weight", "layers.0.attention.wo.weight", "layers.0.feed_forward.w1.weight", "layers.0.feed_forward.w3.weight", "layers.0.feed_forward.w2.weight", "layers.0.ffn_norm.weight", "layers.0.attention_norm.weight",

Here is my weight directory: [image]

jxtngx commented 4 weeks ago

Are you still having this issue?

Can you share the layer names from the original state dict, and also the layer names after conversion?
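One way to collect those names is sketched below. This is only a rough diagnostic, not part of gpt-fast: `load_checkpoint_keys` and `diff_keys` are hypothetical helper names, the expected keys are copied from the error message above, and the "found" names are illustrative HF-style Llama parameter names rather than anything read from your checkpoint. It also assumes the checkpoint file is a plain state dict that `torch.load` can read.

```python
from __future__ import annotations

from typing import Iterable


def load_checkpoint_keys(path: str) -> list[str]:
    """Return the parameter names stored in a PyTorch checkpoint file.

    Hypothetical helper: assumes `path` points at a file holding a plain
    state dict (a mapping from parameter name to tensor).
    """
    import torch  # imported lazily so diff_keys works without torch installed

    state_dict = torch.load(path, map_location="cpu", weights_only=True)
    return list(state_dict.keys())


def diff_keys(expected: Iterable[str], found: Iterable[str]) -> dict[str, list[str]]:
    """Report which expected names are absent and which found names are extra."""
    expected_set, found_set = set(expected), set(found)
    return {
        "missing": sorted(expected_set - found_set),
        "unexpected": sorted(found_set - expected_set),
    }


# Names the Transformer asked for, taken from the error message in this issue.
expected = [
    "tok_embeddings.weight",
    "layers.0.attention.wqkv.weight",
    "layers.0.attention.wo.weight",
]

# Illustrative only: HF-style names, standing in for whatever the file holds.
# In practice: found = load_checkpoint_keys("<path to your checkpoint file>")
found = [
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.q_proj.weight",
]

report = diff_keys(expected, found)
for kind, names in report.items():
    print(kind, names)
```

If every expected name shows up under "missing" and the file only contains HF-style names, the file being loaded is probably not the converted checkpoint, which would narrow down where the problem is.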