google / gemma.cpp

Lightweight, standalone C++ inference engine for Google's Gemma models.
Apache License 2.0
5.98k stars 508 forks

Failed to read from model.weights.h5 - might be a directory, or too small? #46

Closed charbull closed 8 months ago

charbull commented 8 months ago

Hi,

I am experiencing the following issue. I tried these versions: https://www.kaggle.com/models/keras/gemma/frameworks/Keras/variations/gemma_2b_en/versions/1 https://www.kaggle.com/models/keras/gemma/frameworks/Keras/variations/gemma_2b_en/versions/2

./gemma \
--tokenizer vocabulary.spm \
--weights model.weights.h5 \
--compressed_weights 2b-pt-sfp.sbs  --model 2b-pt --verbosity 2
Cached compressed weights does not exist yet (code 256), compressing weights and creating file: 2b-pt-sfp.sbs.
Abort at /Users/charbel/Downloads/gemma/gemma.cpp/./gemma.cc:138: Failed to read from model.weights.h5 - might be a directory, or too small?
zsh: abort      ./gemma --tokenizer vocabulary.spm --weights model.weights.h5  2b-pt-sfp.sbs 

Any ideas how to resolve?

Cheers, Charbel

austinvhuang commented 8 months ago

Hi, in general you shouldn't use the --weights parameter at this time.

In the future you will be able to use it to load fine-tuned weights and create compressed versions, but that requires a Python script to convert the weights: https://github.com/google/gemma.cpp/issues/11

Instead, only use the compressed weights (you don't need the Keras weights, just the sfp files from the gemma.cpp download page).

Second, you probably want to start with the -it "instruction tuned" models, which are more appropriate for interactive use. The -pt "pretrained" models are more of a starting point for fine-tuning.
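Putting both points together, an invocation along these lines should work. This is a sketch: the filenames `2b-it-sfp.sbs` and `vocabulary.spm` are assumptions based on the files mentioned in this thread, and may differ depending on which archive you downloaded.

```shell
# Run gemma.cpp with the compressed (sfp) instruction-tuned weights.
# Note: no --weights flag, and no Keras .h5 file involved.
./gemma \
  --tokenizer vocabulary.spm \
  --compressed_weights 2b-it-sfp.sbs \
  --model 2b-it \
  --verbosity 2
```

The key difference from the failing command above is that `--weights model.weights.h5` is dropped entirely; only the pre-compressed `.sbs` file is passed via `--compressed_weights`.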

austinvhuang commented 8 months ago

Closing for now but if you still run into an issue we'll reopen and help.

charbull commented 8 months ago

Thanks! That worked; I was using the Keras weights.