Closed: bbecausereasonss closed this issue 1 year ago
It looks like the issue is that you're trying to load the model in 4-bit mode, but the webui can't determine the model type from the model name, so it fails. Try adding --gptq-model-type LLaMa to your launch arguments.
Exactly as @BetaDoggo says. Add --model_type llama to your command.
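For anyone else hitting this, here is a minimal sketch of a launch command with the model type specified. The model folder name below is just an example, and how you pass the arguments depends on your setup (for a docker-compose install like the one in the logs, add them to whatever invokes server.py). Older builds use the --gptq-* flag names, which the error message refers to; newer builds renamed them to --wbits and --model_type:

# older flag names (what the "Please specify it manually using --gptq-model-type" error expects)
python server.py --model llama-30b-4bit --gptq-bits 4 --gptq-model-type llama

# newer flag names after the rename
python server.py --model llama-30b-4bit --wbits 4 --model_type llama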
Describe the bug
Loaded the models into the model folder and tried to select them, but got the errors below. Confused.
Does this only support .pt models?
"llama-text-generation-webui-1 | Loading Llama-30b HFv2... llama-text-generation-webui-1 | Could not find llama-30b-4bit.pt, exiting... llama-text-generation-webui-1 | Traceback (most recent call last): llama-text-generation-webui-1 | File "/opt/conda/lib/python3.10/site-packages/gradio/routes.py", line 374, in run_predict llama-text-generation-webui-1 | output = await app.get_blocks().process_api( llama-text-generation-webui-1 | File "/opt/conda/lib/python3.10/site-packages/gradio/blocks.py", line 1017, in process_api llama-text-generation-webui-1 | result = await self.call_function( llama-text-generation-webui-1 | File "/opt/conda/lib/python3.10/site-packages/gradio/blocks.py", line 835, in call_function llama-text-generation-webui-1 | prediction = await anyio.to_thread.run_sync( llama-text-generation-webui-1 | File "/opt/conda/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync llama-text-generation-webui-1 | return await get_asynclib().run_sync_in_worker_thread( llama-text-generation-webui-1 | File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread llama-text-generation-webui-1 | return await future llama-text-generation-webui-1 | File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run llama-text-generation-webui-1 | result = context.run(func, *args) llama-text-generation-webui-1 | File "/app/server.py", line 63, in load_model_wrapper llama-text-generation-webui-1 | shared.model, shared.tokenizer = load_model(shared.model_name) llama-text-generation-webui-1 | File "/app/modules/models.py", line 100, in load_model llama-text-generation-webui-1 | model = load_quantized(model_name) llama-text-generation-webui-1 | File "/app/modules/GPTQ_loader.py", line 53, in load_quantized llama-text-generation-webui-1 | exit() llama-text-generation-webui-1 | File "/opt/conda/lib/python3.10/_sitebuiltins.py", line 26, in call llama-text-generation-webui-1 | raise SystemExit(code) llama-text-generation-webui-1 | SystemExit: None"
and
"llama-text-generation-webui-1 | Can't determine model type from model name. Please specify it manually using --gptq-model-type argument llama-text-generation-webui-1 | Traceback (most recent call last): llama-text-generation-webui-1 | File "/opt/conda/lib/python3.10/site-packages/gradio/routes.py", line 374, in run_predict llama-text-generation-webui-1 | output = await app.get_blocks().process_api( llama-text-generation-webui-1 | File "/opt/conda/lib/python3.10/site-packages/gradio/blocks.py", line 1017, in process_api llama-text-generation-webui-1 | result = await self.call_function( llama-text-generation-webui-1 | File "/opt/conda/lib/python3.10/site-packages/gradio/blocks.py", line 835, in call_function llama-text-generation-webui-1 | prediction = await anyio.to_thread.run_sync( llama-text-generation-webui-1 | File "/opt/conda/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync llama-text-generation-webui-1 | return await get_asynclib().run_sync_in_worker_thread( llama-text-generation-webui-1 | File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread llama-text-generation-webui-1 | return await future llama-text-generation-webui-1 | File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run llama-text-generation-webui-1 | result = context.run(func, *args) llama-text-generation-webui-1 | File "/app/server.py", line 63, in load_model_wrapper llama-text-generation-webui-1 | shared.model, shared.tokenizer = load_model(shared.model_name) llama-text-generation-webui-1 | File "/app/modules/models.py", line 100, in load_model llama-text-generation-webui-1 | model = load_quantized(model_name) llama-text-generation-webui-1 | File "/app/modules/GPTQ_loader.py", line 21, in load_quantized llama-text-generation-webui-1 | exit() llama-text-generation-webui-1 | File "/opt/conda/lib/python3.10/_sitebuiltins.py", line 26, in call llama-text-generation-webui-1 | raise SystemExit(code) llama-text-generation-webui-1 | SystemExit: None"
Is there an existing issue for this?
Reproduction
Load the models into the model folder and try to select them in the web UI.
Screenshot
No response
Logs
System Info