Closed · EugeoSynthesisThirtyTwo closed this 10 months ago
Running it through a session is also throwing an error. If I run it the way you told me, I still get an error. I wonder how it's working on your system.
I don't know how to load the model on my system :/ Did you somehow manage to run it using a different method? (other command-line args, for instance)
@oobabooga I had the same issue: when not specifying the loader as autogptq, the embed_token method cannot be found. There must be a bug in the default loader for GPTQ llava v1.5 models.
@EugeoSynthesisThirtyTwo You need to specify autogptq and disable exllama for the time being:
MODEL=llava-v1.5-13B-GPTQ
python server.py --model $MODEL \
--loader autogptq \
--disable_exllama \
--multimodal-pipeline llava-v1.5-13b
@yhyu13 Thank you, I got rid of this error, but unfortunately I have another error now. I created a new bug report here, https://github.com/oobabooga/text-generation-webui/issues/4398, because I don't know if it's related.
Nevermind, I made a mistake! Now it works, thank you! I guess I should leave this thread open, since there is a bug when I follow the guide strictly.
C:\AI\text-generation-webui>python server.py --model TheBloke_llava-v1.5-13B-GPTQ_gptq-4bit-32g-actorder_True --multimodal-pipeline llava-v1.5-13b --disable_exllama --loader autogptq
bin C:\Users\Govind\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes\libbitsandbytes_cpu.so
C:\Users\Govind\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes\cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
function 'cadam32bit_grad_fp32' not found
2023-10-28 00:13:44 INFO:Loading settings from settings.yaml...
2023-10-28 00:13:44 INFO:Loading TheBloke_llava-v1.5-13B-GPTQ_gptq-4bit-32g-actorder_True...
2023-10-28 00:13:44 INFO:The AutoGPTQ params are: {'model_basename': 'model', 'device': 'cuda:0', 'use_triton': False, 'inject_fused_attention': True, 'inject_fused_mlp': True, 'use_safetensors': True, 'trust_remote_code': False, 'max_memory': None, 'quantize_config': None, 'use_cuda_fp16': True, 'disable_exllama': True}
2023-10-28 00:13:44 WARNING:CUDA kernels for auto_gptq are not installed, this will result in very slow inference speed. This may because:
I am having this error, can anyone help?
@Govindai1
Are you on a CUDA device? Then you need to install PyTorch built for the appropriate CUDA version. Check out the download section of the official PyTorch website.
Correct me if I am wrong, but textgen mostly supports torch 2.0.1 with CUDA 11.8.
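For reference, installing that combination usually looks like the following (a sketch based on the standard PyTorch install instructions; the exact versions above are the previous commenter's suggestion, so adjust them to your GPU and driver):

```shell
# Install PyTorch 2.0.1 built against CUDA 11.8 from the official wheel index
pip install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu118

# Sanity check: on a working CUDA install this should print True
python -c "import torch; print(torch.cuda.is_available())"
```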
In case somebody else has the same issue: I had the "AttributeError: 'NoneType' object has no attribute 'lower'" message on my Windows 11 PC. It finally went away when I used the CMD_FLAGS.txt file to set the command-line options, rather than the OOBABOOGA_FLAGS environment variable.
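For anyone curious why the two mechanisms can behave differently: a launcher that accepts both a flags file and an environment variable has to pick one source of truth, and the report above suggests the env-var path misbehaves on Windows. The sketch below is a hypothetical illustration of the flags-file approach (the helper name and the merging logic are my own, not the webui's actual code):

```python
import os

def read_flags(flags_file="CMD_FLAGS.txt", env_var="OOBABOOGA_FLAGS"):
    """Return command-line flags, preferring the flags file over the env var.

    Hypothetical sketch of how a launcher might merge the two sources;
    this is NOT the actual text-generation-webui implementation.
    """
    if os.path.isfile(flags_file):
        with open(flags_file, encoding="utf-8") as f:
            # Skip blank lines and comments, join the rest into one flag string
            lines = [ln.strip() for ln in f
                     if ln.strip() and not ln.lstrip().startswith("#")]
        if lines:
            return " ".join(lines)
    # Fall back to the environment variable (empty string if unset)
    return os.environ.get(env_var, "")
```

With this precedence, anything written in CMD_FLAGS.txt wins, which matches the workaround that fixed the error above.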
This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.
Describe the bug
LLaVA can write text, but it raises an error when trying to read an image.
Is there an existing issue for this?
Reproduction
TheBloke/llava-v1.5-13B-GPTQ:gptq-4bit-32g-actorder_True
python server.py --model TheBloke_llava-v1.5-13B-GPTQ_gptq-4bit-32g-actorder_True --multimodal-pipeline llava-v1.5-13b
Screenshot
Logs
System Info