abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

Assertion Error while Loading a Phi2 GGUF Model via llama-cpp-python #1076

Deepansharora27 opened this issue 8 months ago (status: Open)

Deepansharora27 commented 8 months ago

I am trying to load a Phi-2 GGUF model via the llama-cpp-python dependency. The model I am trying to load is this one: https://huggingface.co/TheBloke/phi-2-GGUF

I am getting this error while trying to load the model:

(Screenshot attached: AssertionError traceback, 2024-01-10.)
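
For context, a minimal loading sketch with llama-cpp-python looks like this; the path and parameters below are placeholders, not the exact values from the screenshot:

```python
# Minimal sketch, assuming a locally downloaded GGUF file; the path and
# parameters below are placeholders, not the reporter's exact values.
from llama_cpp import Llama

llm = Llama(
    model_path="./phi-2.Q4_K_M.gguf",  # hypothetical local path to the GGUF file
    n_ctx=2048,                        # context window size
)
```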
abhijit156 commented 8 months ago

I am facing similar issues with TheBloke's other GGUF models, specifically Llama 7B and Mixtral. I've been oscillating between this 'AssertionError', 'Cannot infer suitable class', and 'model does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack'. I have tried all the fixes I could find online. Does anyone have tips for getting any of these GGUF models running on an M1 Mac? To be clear, running them through the llama.cpp CLI seems to work, but I am unable to run them from Python code. A rough sketch of the Python path I am aiming for is below.
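
For reference, the rough Python equivalent of a basic llama.cpp CLI run is just a load plus one completion call; the model path and prompt here are placeholders for illustration:

```python
# Sketch of the Python counterpart to a simple llama.cpp CLI invocation.
# The model path and prompt are placeholders, not the reporter's actual values.
from llama_cpp import Llama

llm = Llama(model_path="./llama-2-7b.Q4_K_M.gguf", n_ctx=2048)
out = llm("Q: Name the planets in the solar system. A:", max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"])
```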

abetlen commented 8 months ago

Is there any error above that? That indicates a model load error, but it could be caused by an invalid path or by files that weren't downloaded properly; it's hard to tell without more info.

For reference, I've actually been able to load that model with the latest version, so it should work.
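
One way to rule out a bad path or an incomplete download is to fetch the file through huggingface_hub and pass the returned path straight to Llama. A minimal sketch, assuming the quantization filename below exists in the repo (check the repo's file listing):

```python
# Sketch: download the GGUF through huggingface_hub so the local path is known
# to be valid, then load it. The filename is an assumption, not confirmed here.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="TheBloke/phi-2-GGUF",
    filename="phi-2.Q4_K_M.gguf",  # assumed quantization; pick one the repo actually lists
)
llm = Llama(model_path=model_path)
```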

acon96 commented 7 months ago

Microsoft changed the model module names for Phi for the third time, which broke llama.cpp.

Support for the new format was added a few hours ago: https://github.com/ggerganov/llama.cpp/commit/15ebe59210e7fd9817ff67f51fa1a5ee2d004294. This should just require a version update to fix.
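
Once a release containing that commit is installed, a quick way to confirm which binding version is actually being picked up is a minimal check like the following, assuming the package exposes __version__ as recent releases do:

```python
# Quick sanity check of the installed binding version after upgrading,
# assuming the package exposes __version__ (recent releases do).
import llama_cpp

print(llama_cpp.__version__)
```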

abetlen commented 7 months ago

@acon96 I will update tonight ASAP then!