randaller / llama-chat

Chat with Meta's LLaMA models at home made easy
GNU General Public License v3.0
833 stars 118 forks

Error on run: size mismatch for ... #9

Closed MartinKlefas closed 1 year ago

MartinKlefas commented 1 year ago

I've managed to complete all the steps but the last. When I run `python example-chat.py ./model ./tokenizer/tokenizer.model`, I wait a few minutes and then get a lot of error lines like:

        size mismatch for tok_embeddings.weight: copying a param with shape torch.Size([32000, 6656]) from checkpoint, the shape in current model is torch.Size([32000, 5120]).
        size mismatch for layers.39.ffn_norm.weight: copying a param with shape torch.Size([6656]) from checkpoint, the shape in current model is torch.Size([5120]).
        size mismatch for norm.weight: copying a param with shape torch.Size([6656]) from checkpoint, the shape in current model is torch.Size([5120]).
        size mismatch for output.weight: copying a param with shape torch.Size([32000, 6656]) from checkpoint, the shape in current model is torch.Size([32000, 5120]).

randaller commented 1 year ago

@MartinKlefas make sure you have only merged.pth and the correct params.json (and the pyarrow folder) in the model folder. This error indicates torch was unable to load the weights correctly; there is probably more than one .pth file in the model folder. Also check that your params.json matches the model; it looks like you used a params.json file from another model.
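For reference, each released LLaMA variant has a distinct hidden size (7B: 4096, 13B: 5120, 30B: 6656, 65B: 8192), so the two widths in the traceback identify which checkpoint and which params.json got mixed up. A minimal sketch of that check (the helper name and the inline params.json are hypothetical, not part of llama-chat):

```python
import json

# Hidden sizes ("dim" in params.json) for Meta's released LLaMA variants.
EXPECTED_DIM = {"7B": 4096, "13B": 5120, "30B": 6656, "65B": 8192}

def model_size_for_dim(dim):
    """Return the LLaMA variant whose hidden size matches `dim`, or None."""
    for size, d in EXPECTED_DIM.items():
        if d == dim:
            return size
    return None

# The checkpoint in the error above has 6656-wide embeddings, while the
# params.json describes a 5120-wide model:
checkpoint_dim = 6656
params = json.loads('{"dim": 5120, "n_layers": 40, "n_heads": 40}')

print(model_size_for_dim(checkpoint_dim))  # 30B
print(model_size_for_dim(params["dim"]))   # 13B
```

Here the checkpoint is from the 30B model while params.json came from the 13B model, which is exactly the "6656 vs 5120" mismatch in the error output.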

MartinKlefas commented 1 year ago

Thanks, I think I moved the wrong json file, as it's working now. Five minutes between prompt and answer, apparently, but that's my fault for having "only" 64GB of RAM and a two-year-old GPU.

MartinKlefas commented 1 year ago

Scrub that, it's 5 minutes for each of the bottom progress bars to move. 31 hours to an answer!

[screenshot: generation progress bars]

randaller commented 1 year ago

@MartinKlefas I also feel the pain when trying to run inference on the 65B model :) 30B acts the same way, expecting the complete 2048 tokens in only 4 hours :) But I'm stopping much earlier.