oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.

[Error]: Can't load model #2662

Closed: Woisek closed this issue 1 year ago

Woisek commented 1 year ago

Describe the bug

No matter what model I load, it always produces an error (wizardLM-7B-GPTQ-4bit-128g, wizard-vicuna-7b-uncensored-gptq-4bit-128g no-act-order safetensors).

Is there an existing issue for this?

Reproduction

Loading a model

Screenshot

(attached clipboard image)

Logs

2023-06-13 12:00:59 INFO:Loading wizardLM-7B-GPTQ-4bit-128g...
2023-06-13 12:00:59 INFO:The AutoGPTQ params are: {'model_basename': 'wizardlm-7b-gptq-4bit-128g.ooba.no-act-order.2', 'device': 'cuda:0', 'use_triton': False, 'use_safetensors': False, 'trust_remote_code': False, 'max_memory': None, 'quantize_config': BaseQuantizeConfig(bits=4, group_size=128, damp_percent=0.01, desc_act=False, sym=True, true_sequential=True, model_name_or_path=None, model_file_base_name=None)}
2023-06-13 12:01:00 WARNING:The model weights are not tied. Please use the `tie_weights` method before using the `infer_auto_device` function.
2023-06-13 12:01:03 WARNING:skip module injection for FusedLlamaMLPForQuantizedModel not support integrate without triton yet.

Traceback (most recent call last):
  File "F:\Programme\oobabooga_windows\text-generation-webui\server.py", line 70, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "F:\Programme\oobabooga_windows\text-generation-webui\modules\models.py", line 102, in load_model
    tokenizer = load_tokenizer(model_name, model)
  File "F:\Programme\oobabooga_windows\text-generation-webui\modules\models.py", line 127, in load_tokenizer
    tokenizer = LlamaTokenizer.from_pretrained(Path(f"{shared.args.model_dir}/{model_name}/"), clean_up_tokenization_spaces=True)
  File "F:\Programme\oobabooga_windows\installer_files\env\lib\site-packages\transformers\tokenization_utils_base.py", line 1825, in from_pretrained
    return cls.from_pretrained(
  File "F:\Programme\oobabooga_windows\installer_files\env\lib\site-packages\transformers\tokenization_utils_base.py", line 1988, in from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "F:\Programme\oobabooga_windows\installer_files\env\lib\site-packages\transformers\models\llama\tokenization_llama.py", line 96, in __init__
    self.sp_model.Load(vocab_file)
  File "F:\Programme\oobabooga_windows\installer_files\env\lib\site-packages\sentencepiece\__init__.py", line 905, in Load
    return self.LoadFromFile(model_file)
  File "F:\Programme\oobabooga_windows\installer_files\env\lib\site-packages\sentencepiece\__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
TypeError: not a string

System Info

Windows 10, Nvidia RTX 2080 SUPER 8GB VRAM
Woisek commented 1 year ago

Loading the mayaeary_pygmalion-6b_dev-4bit-128g model throws:

Traceback (most recent call last):
  File "F:\Programme\oobabooga_windows\text-generation-webui\server.py", line 70, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "F:\Programme\oobabooga_windows\text-generation-webui\modules\models.py", line 94, in load_model
    output = load_func(model_name)
  File "F:\Programme\oobabooga_windows\text-generation-webui\modules\models.py", line 296, in AutoGPTQ_loader
    return modules.AutoGPTQ_loader.load_quantized(model_name)
  File "F:\Programme\oobabooga_windows\text-generation-webui\modules\AutoGPTQ_loader.py", line 60, in load_quantized
    model.embed_tokens = model.model.model.embed_tokens
  File "F:\Programme\oobabooga_windows\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'GPTJForCausalLM' object has no attribute 'model'

Only the facebook model works (facebook_opt-6.7b). 😞

peaceandtr commented 1 year ago

Same problem. Two days ago everything worked fine, but today I reinstalled ooba and the error appeared. It's the same model that's affected, but this 7B still works: https://huggingface.co/AnimusOG/pygmalion-7b-4bit-128g-cuda-2048Token

jllllll commented 1 year ago

Which of these did you download?
https://huggingface.co/Aitrepreneur/wizardLM-7B-GPTQ-4bit-128g
https://huggingface.co/TheBloke/wizardLM-7B-GPTQ

You may need to select the gptq-for-llama option and set wbits to 4 and groupsize to 128
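
If you prefer to set these at launch time, the equivalent command-line flags (as they existed around this version of the web UI; flag names may differ on yours, so check python server.py --help) would be something like:

python server.py --model wizardLM-7B-GPTQ-4bit-128g --wbits 4 --groupsize 128 --model_type llama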

Iory1998 commented 1 year ago

Loading the mayaeary_pygmalion-6b_dev-4bit-128g model throws: [same 'GPTJForCausalLM' object has no attribute 'model' traceback as above] Only the facebook model works (facebook_opt-6.7b). 😞

Same here! So maybe a bad update messed things up. Note that these models work for me: koala-7B-GPTQ-4bit-128g, TheBloke_guanaco-7B-GPTQ, TheBloke_WizardLM-7B-uncensored-GPTQ, WizardLM-7B-Original-GPTQ

Woisek commented 1 year ago

Which of these did you download?
https://huggingface.co/Aitrepreneur/wizardLM-7B-GPTQ-4bit-128g
https://huggingface.co/TheBloke/wizardLM-7B-GPTQ

You may need to select the gptq-for-llama option and set wbits to 4 and groupsize to 128

I use wizardLM-7B-GPTQ-4bit-128g. Right, thanks for this hint, now I get an idea why the names look like that. 🤪 I didn't know those were settings to use. That's the good news. The bad news is that I still get an error. When I use the settings you mentioned, I get

2023-06-14 08:52:34 INFO:Loading wizardLM-7B-GPTQ-4bit-128g...
2023-06-14 08:52:34 ERROR:The model could not be loaded because its type could not be inferred from its name.
2023-06-14 08:52:34 ERROR:Please specify the type manually using the --model_type argument.

The model type dropdown is set to "none". But much worse: all three of the other model type options again throw the error

Traceback (most recent call last):
  File "F:\Programme\oobabooga_windows\text-generation-webui\server.py", line 70, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "F:\Programme\oobabooga_windows\text-generation-webui\modules\models.py", line 102, in load_model
    tokenizer = load_tokenizer(model_name, model)
  File "F:\Programme\oobabooga_windows\text-generation-webui\modules\models.py", line 127, in load_tokenizer
    tokenizer = LlamaTokenizer.from_pretrained(Path(f"{shared.args.model_dir}/{model_name}/"), clean_up_tokenization_spaces=True)
  File "F:\Programme\oobabooga_windows\installer_files\env\lib\site-packages\transformers\tokenization_utils_base.py", line 1825, in from_pretrained
    return cls.from_pretrained(
  File "F:\Programme\oobabooga_windows\installer_files\env\lib\site-packages\transformers\tokenization_utils_base.py", line 1988, in from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "F:\Programme\oobabooga_windows\installer_files\env\lib\site-packages\transformers\models\llama\tokenization_llama.py", line 96, in __init__
    self.sp_model.Load(vocab_file)
  File "F:\Programme\oobabooga_windows\installer_files\env\lib\site-packages\sentencepiece\__init__.py", line 905, in Load
    return self.LoadFromFile(model_file)
  File "F:\Programme\oobabooga_windows\installer_files\env\lib\site-packages\sentencepiece\__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
TypeError: not a string

🤔

I would also like to use mayaeary_pygmalion-6b_dev-4bit-128g, but "dev" is not a model type I can select. And I would like to use wizard-vicuna-7b-uncensored-gptq-4bit-128g, but that results in endless loading. Any hint on those, too, please? 😞

DeathW1ng39 commented 1 year ago

So guys, I had the same issue, but switching to an older version fixed it (though keep in mind that this removes all the features from the newer versions). I'll walk you through the steps I took to make it work.

1. Open the text-generation-webui folder.
2. Type "cmd" in the folder's address bar.
3. Run this command: git checkout 5543a5089d455708bae3b27e2461ac25e4482da6

And that's it. I took the commit from Aitrepreneur's video, so all the credit should go to him if this helps you.
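
If you later want to undo the downgrade and return to the latest version, checking the default branch out again and pulling should work (assuming the default branch is main):

git checkout main
git pull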

LDzik commented 1 year ago

I tried the downgrade steps above, but for some reason Python now stops working (crashes) for me while the model is loading.

jllllll commented 1 year ago

@Woisek Set the Model Type option in the models tab to llama. You can also add -llama to the model folder's name.
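
For example, from a command prompt inside your models folder (the folder name here is the one from this thread; any name containing "llama" should let the type be inferred):

ren wizardLM-7B-GPTQ-4bit-128g wizardLM-7B-GPTQ-4bit-128g-llama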

Originalimoc commented 1 year ago

Are you missing the tokenizer.model file?
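
That would explain the "TypeError: not a string" traceback: LlamaTokenizer.from_pretrained() looks for tokenizer.model in the model folder, and when the file is absent, sentencepiece is handed None instead of a file path. A quick check (a sketch, assuming the default models directory; adjust the folder name to yours):

from pathlib import Path

# Example folder name taken from this thread; point this at your own model folder.
model_dir = Path("models/wizardLM-7B-GPTQ-4bit-128g")
if not (model_dir / "tokenizer.model").exists():
    print("tokenizer.model is missing - download it from the original model repo")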

Iory1998 commented 1 year ago

@Woisek Set the Model Type option in the models tab to llama. You can also add -llama to the model folder's name.

I already do that. I don't know why, but I can't use AutoGPTQ. I use ExLlama instead, so it's fine.

github-actions[bot] commented 1 year ago

This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.