error loading model: missing tok_embeddings.weight

sp-yang commented 1 year ago

System Info

gpt4all: 0.3.6
python: 3.9.16

Information

[ ] The official example notebooks/scripts
[X] My own modified scripts

Related Components

[ ] backend
[ ] bindings
[X] python-bindings
[ ] chat-ui
[ ] models
[ ] circleci
[ ] docker
[ ] api

Reproduction

import gpt4all

falcon = gpt4all.GPT4All(model_name="ggml-model-gpt4all-falcon-q4_0.bin", model_path="./LLM/")
messages = [{"role": "user", "content": "Name 3 colors"}]
falcon.chat_completion(messages)

Expected behavior

The model should load successfully. Instead, I got the following message:

error loading model: missing tok_embeddings.weight
llama_init_from_file: failed to load model

The same model loads successfully in the gpt4all UI client, but not in Python. Is this an issue with the model or with the Python code? Thanks.
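The `llama_init_from_file` line in the message suggests the file is being handed to a LLaMA loader, so one possibility is that the installed bindings do not recognize the Falcon architecture. That is an assumption, not something confirmed in this thread. To help narrow it down, the sketch below collects what the Python side actually sees before loading. `describe_setup` is a hypothetical helper, not part of the gpt4all API; it only uses the standard library to report the installed package version and whether the model file exists at the expected path.

```python
from importlib import metadata
from pathlib import Path


def describe_setup(model_name, model_path, package="gpt4all"):
    """Hypothetical helper: collect facts worth attaching to a bug report."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None  # the package is not installed in this environment

    model_file = Path(model_path) / model_name
    exists = model_file.is_file()
    return {
        "package_version": version,
        "model_file": str(model_file),
        "exists": exists,
        "size_bytes": model_file.stat().st_size if exists else None,
    }


if __name__ == "__main__":
    # Same file name and directory as in the reproduction above.
    info = describe_setup("ggml-model-gpt4all-falcon-q4_0.bin", "./LLM/")
    for key, value in info.items():
        print(f"{key}: {value}")
```

If the file exists with a plausible size and the UI client loads the same file, the difference is more likely in the bindings version than in the download itself.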

lucianosilvi commented 1 year ago

I managed to load it with transformers, but generation this way is much slower than in the gpt4all UI, which is very fast.

from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("nomic-ai/gpt4all-falcon", trust_remote_code=True)
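The snippet above only loads the weights. For completeness, here is a minimal sketch of running a generation with them. The prompt layout in `format_prompt` is an assumption made for illustration, not the format the model was trained on, and both function names are hypothetical. `transformers` is imported lazily so the prompt helper can be used on its own.

```python
def format_prompt(messages):
    """Flatten chat-style messages into one prompt string (assumed layout)."""
    lines = [f"{m['role']}: {m['content']}" for m in messages]
    lines.append("assistant:")
    return "\n".join(lines)


def generate(messages, model_id="nomic-ai/gpt4all-falcon", max_new_tokens=64):
    # Imported here so format_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    inputs = tokenizer(format_prompt(messages), return_tensors="pt")
    # Pass only the tensors generate() needs; some tokenizers return extra keys.
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(format_prompt([{"role": "user", "content": "Name 3 colors"}]))
```

Calling `generate([{"role": "user", "content": "Name 3 colors"}])` downloads the full-precision checkpoint and runs it on the CPU by default, which is consistent with it being much slower than the quantized file the UI uses.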

sp-yang commented 1 year ago

@lucianosilvi Thanks for your reply. The model ggml-model-gpt4all-falcon-q4_0.bin, which was downloaded from https://gpt4all.io/, cannot be loaded with the gpt4all Python bindings. Is there a way to load it in Python and run it faster?
