MartinKlefas closed this issue 1 year ago.
@MartinKlefas make sure you have only merged.pth and the correct params.json (and the pyarrow folder) in the model folder. This error indicates torch was unable to load the weights correctly; probably there is more than one .pth file in the model folder. Also check that your params.json actually matches the model — it looks like you used a params.json file from another model.
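As a quick sanity check before spending minutes loading weights, the folder layout above can be verified with a short script. This is a minimal sketch, not part of the llama codebase: the expected filenames (`merged.pth`, `params.json`) come from this thread, and the `params.json` field names checked (`dim`, `n_layers`, `n_heads`) are an assumption based on LLaMA's config format.

```python
import json
from pathlib import Path

def check_model_folder(model_dir: str) -> list[str]:
    """Return a list of problems found in the model folder.

    Expects exactly one .pth checkpoint (e.g. merged.pth) and a
    params.json containing the usual LLaMA config fields (assumed
    here to be 'dim', 'n_layers', 'n_heads').
    """
    problems = []
    folder = Path(model_dir)

    # More than one .pth file is the most common cause of this load error.
    pth_files = sorted(folder.glob("*.pth"))
    if len(pth_files) != 1:
        problems.append(
            f"expected exactly one .pth file, found {len(pth_files)}: "
            f"{[p.name for p in pth_files]}"
        )

    # params.json must exist and contain the expected config fields.
    params = folder / "params.json"
    if not params.exists():
        problems.append("params.json is missing")
    else:
        cfg = json.loads(params.read_text())
        for key in ("dim", "n_layers", "n_heads"):
            if key not in cfg:
                problems.append(f"params.json is missing the '{key}' field")

    return problems

# Report anything suspicious before launching inference.
for issue in check_model_folder("./model"):
    print("model folder problem:", issue)
```

Note this only checks that a params.json is present and well-formed; whether its values actually match the checkpoint (e.g. 7B vs 65B dimensions) still has to be verified against the model you downloaded.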
Thanks I think I moved the wrong json file, as it's working now. 5 minutes between prompt and answer apparently, but that's my fault for having "only" 64GB of RAM and a 2 year old GPU.
Scrub that, it's 5 minutes for each of the bottom progress bars to move. 31 hours to an answer!
@MartinKlefas I also feel the pain when trying to run inference on the 65B model :) 30B behaves the same way; I'm expecting a complete 2048 tokens in only 4 hours :) But I'm stopping much earlier.
I've managed to complete all the steps but the last. When I run `python example-chat.py ./model ./tokenizer/tokenizer.model`,
I wait a few minutes, then get a lot of error lines like: