oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.

Cannot use Llava (1.5) with AutoAWQ: 'LlamaAWQForCausalLM' object has no attribute 'device' #4308

Closed: Subarasheese closed this issue 1 year ago

Subarasheese commented 1 year ago

Describe the bug

Hello,

I downloaded this model: https://huggingface.co/TheBloke/llava-v1.5-13B-AWQ

I am now trying to use it with the 'multimodal' extension, loading it with the AutoAWQ loader.

I am using this command to try to load it:

python3 server.py --model TheBloke_llava-v1.5-13B-AWQ --multimodal-pipeline llava-llama-2-13b --loader AutoAWQ

Chat works fine; however, when I upload an image, generation fails. See the logs below.

Is there an existing issue for this?

Reproduction

python3 server.py --model TheBloke_llava-v1.5-13B-AWQ --multimodal-pipeline llava-llama-2-13b --loader AutoAWQ

Screenshot

No response

Logs

Traceback (most recent call last):
  File "/home/USERNAME123/.conda/envs/textgen/lib/python3.10/site-packages/SOFTWARE_ABC/queueing.py", line 406, in call_prediction
    output = await route_utils.call_process_api(
  File "/home/USERNAME123/.conda/envs/textgen/lib/python3.10/site-packages/SOFTWARE_DEF/route_utils.py", line 226, in call_process_api
    output = await app.get_blocks().process_api(
  File "/home/USERNAME123/.conda/envs/textgen/lib/python3.10/site-packages/SOFTWARE_GHI/blocks.py", line 1554, in process_api
    result = await self.call_function(
  File "/home/USERNAME123/.conda/envs/textgen/lib/python3.10/site-packages/SOFTWARE_JKL/blocks.py", line 1206, in call_function
    prediction = await utils.async_iteration(iterator)
  File "/home/USERNAME123/.conda/envs/textgen/lib/python3.10/site-packages/SOFTWARE_MNO/utils.py", line 517, in async_iteration
    return await iterator.__anext__()
  File "/home/USERNAME123/.conda/envs/textgen/lib/python3.10/site-packages/SOFTWARE_PQR/utils.py", line 510, in __anext__
    return await anyio.to_thread.run_sync(
  File "/home/USERNAME123/.conda/envs/textgen/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/home/USERNAME123/.conda/envs/textgen/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2106, in run_sync_in_worker_thread
    return await future
  File "/home/USERNAME123/.conda/envs/textgen/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 833, in run
    result = context.run(func, *args)
  File "/home/USERNAME123/.conda/envs/textgen/lib/python3.10/site-packages/SOFTWARE_STU/utils.py", line 493, in run_sync_iterator_async
    return next(iterator)
  File "/media/USERNAME123/DISK_1TB/PROJECT_XYZ/modules/chat.py", line 329, in generate_chat_reply_wrapper
    for i, history in enumerate(generate_chat_reply(text, state, regenerate, _continue, loading_message=True)):
  File "/media/USERNAME123/DISK_1TB/PROJECT_XYZ/modules/chat.py", line 297, in generate_chat_reply
    for history in chatbot_wrapper(text, state, regenerate=regenerate, _continue=_continue, loading_message=loading_message):
  File "/media/USERNAME123/DISK_1TB/PROJECT_XYZ/modules/chat.py", line 237, in chatbot_wrapper
    for j, reply in enumerate(generate_reply(prompt, state, stopping_strings=stopping_strings, is_chat=True)):
  File "/media/USERNAME123/DISK_1TB/PROJECT_XYZ/modules/text_generation.py", line 30, in generate_reply
    for result in _generate_reply(*args, **kwargs):
  File "/media/USERNAME123/DISK_1TB/PROJECT_XYZ/modules/text_generation.py", line 77, in _generate_reply
    for reply in generate_func(question, original_question, seed, state, stopping_strings, is_chat=is_chat):
  File "/media/USERNAME123/DISK_1TB/PROJECT_XYZ/modules/text_generation.py", line 309, in generate_reply_HF
    question, input_ids, inputs_embeds = apply_extensions('tokenizer', state, question, input_ids, None)
  File "/media/USERNAME123/DISK_1TB/PROJECT_XYZ/extensions/multimodal/script.py", line 89, in tokenizer_modifier
    prompt, input_ids, input_embeds, total_embedded = multimodal_embedder.forward(prompt, state, params)
  File "/media/USERNAME123/DISK_1TB/PROJECT_XYZ/extensions/multimodal/multimodal_embedder.py", line 171, in forward
    prompt_parts = self._encode_text(state, prompt_parts)
  File "/media/USERNAME123/DISK_1TB/PROJECT_XYZ/extensions/multimodal/multimodal_embedder.py", line 105, in _encode_text
    encoded.append(self._encode_single_text(part, i == 0 and state['add_bos_token']))
  File "/media/USERNAME123/DISK_1TB/PROJECT_XYZ/extensions/multimodal/multimodal_embedder.py", line 85, in _encode_single_text
    part.input_ids = encode(part.text, add_bos_token=add_bos_token)[0].to(shared.model.device, dtype=torch.int64)
  File "/home/USERNAME123/.conda/envs/textgen/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1695, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'LlamaAWQForCausalLM' object has no attribute 'device'

System Info

Arch Linux
RTX 3090, CUDA 11.8
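
For reference, the failure happens at the last application frame in the traceback: _encode_single_text in extensions/multimodal/multimodal_embedder.py calls .to(shared.model.device, ...). Regular transformers models expose .device as a property, but the __getattr__ frame shows that AutoAWQ's LlamaAWQForCausalLM is a plain torch nn.Module with no such attribute, so the lookup raises AttributeError. A possible fix, as an untested sketch that assumes the AWQ wrapper keeps the underlying HF model on a .model attribute, would be to resolve the device defensively in that file:

import torch

def _model_device(model) -> torch.device:
    # transformers models expose .device as a property
    if hasattr(model, 'device'):
        return model.device
    # assumption: AutoAWQ wrappers hold the underlying HF model at .model
    inner = getattr(model, 'model', None)
    if inner is not None and hasattr(inner, 'device'):
        return inner.device
    # last resort: use the device of the first parameter
    return next(model.parameters()).device

_encode_single_text would then read:

part.input_ids = encode(part.text, add_bos_token=add_bos_token)[0].to(_model_device(shared.model), dtype=torch.int64)
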
github-actions[bot] commented 1 year ago

This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.

Nrgte commented 9 months ago

I have the same issue; does anyone know whether there is a fix for this?
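
A possible stopgap until the extension is patched, assuming the AWQ wrapper is a torch nn.Module whose weights are already on the GPU (the __getattr__ frame in the traceback above suggests it is), is to attach a device property to the wrapper class after the model loads. Untested sketch:

from modules import shared  # text-generation-webui globals; model already loaded

AWQClass = type(shared.model)  # e.g. LlamaAWQForCausalLM
if not hasattr(AWQClass, 'device'):
    # assumption: the wrapper is an nn.Module, so .parameters() exists and is non-empty
    AWQClass.device = property(lambda self: next(self.parameters()).device)

After this, code that reads shared.model.device (like the multimodal embedder) should resolve to the device of the model's first parameter.
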