zhuyiche / llava-phi


[Error] app.py(Gradio_) #8

Closed · BigJoon closed 8 months ago

BigJoon commented 8 months ago

I ran the Gradio demo (app.py) from your repo with a sample input, but an error occurred. Do you recognize this error?

The model I used was llavaPhi, the one you uploaded to Hugging Face...


```
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/opt/conda/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/generation/utils.py", line 1764, in generate
    return self.sample(
  File "/opt/conda/lib/python3.10/site-packages/transformers/generation/utils.py", line 2861, in sample
    outputs = self(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/workspace/llava-phi/llava_phi/model/language_model/llava_phi.py", line 59, in forward
    input_ids, attention_mask, past_key_values, inputs_embeds, labels = self.prepare_inputs_labels_for_multimodal(
  File "/workspace/workspace/llava-phi/llava_phi/model/llava_arch.py", line 73, in prepare_inputs_labels_for_multimodal
    image_features = self.encode_images(images)
  File "/workspace/workspace/llava-phi/llava_phi/model/llava_arch.py", line 54, in encode_images
    image_features = self.get_model().mm_projector(image_features)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/container.py", line 217, in forward
    input = module(input)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 must have the same dtype
```
JLM-Z commented 8 months ago

Sorry, we haven't tested the local demo yet. This error occurs when llavaPhi is loaded in fp32 but the image tensor is encoded in fp16. You can modify the code so that both use the same dtype.

Hope this helps you!
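
For example, a minimal sketch of one way to align them (`match_dtype` is a hypothetical helper, not the repo's exact fix): cast the encoded image tensor to whatever dtype the model weights use before it reaches the `mm_projector`:

```python
import torch

def match_dtype(image_tensor: torch.Tensor, model: torch.nn.Module) -> torch.Tensor:
    # Cast the image tensor to the dtype of the model's weights, so that
    # F.linear(input, weight, bias) inside mm_projector sees matching dtypes.
    target_dtype = next(model.parameters()).dtype  # fp32 or fp16, whichever was loaded
    return image_tensor.to(dtype=target_dtype)
```

Alternatively, loading the whole model in fp16 (e.g. `from_pretrained(..., torch_dtype=torch.float16)`) avoids the mismatch from the other side.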

BigJoon commented 8 months ago

@JLM-Z oh!! You are right!

That error disappeared, but another one came up. When I upload an image and say "hi", it comes out like this...

```
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/gradio/routes.py", line 437, in run_predict
    output = await app.get_blocks().process_api(
  File "/opt/conda/lib/python3.10/site-packages/gradio/blocks.py", line 1352, in process_api
    result = await self.call_function(
  File "/opt/conda/lib/python3.10/site-packages/gradio/blocks.py", line 1093, in call_function
    prediction = await utils.async_iteration(iterator)
  File "/opt/conda/lib/python3.10/site-packages/gradio/utils.py", line 341, in async_iteration
    return await iterator.__anext__()
  File "/opt/conda/lib/python3.10/site-packages/gradio/utils.py", line 334, in __anext__
    return await anyio.to_thread.run_sync(
  File "/opt/conda/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
  File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "/opt/conda/lib/python3.10/site-packages/gradio/utils.py", line 317, in run_sync_iterator_async
    return next(iterator)
  File "/workspace/workspace/llava-phi/llava_phi/serve/app.py", line 209, in http_bot
    for chunk in output:
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/workspace/workspace/llava-phi/llava_phi/serve/app.py", line 160, in get_response
    for new_text in streamer:
  File "/opt/conda/lib/python3.10/site-packages/transformers/generation/streamers.py", line 223, in __next__
    value = self.text_queue.get(timeout=self.timeout)
  File "/opt/conda/lib/python3.10/queue.py", line 179, in get
    raise Empty
_queue.Empty
```
JLM-Z commented 8 months ago

Sorry, as there is no time to test the local demo at the moment, you can use the following script to chat about images without needing the Gradio interface:

```
python -m llava_phi.serve.cli \
    --model-path /path/to/checkpoints/llava \
    --image-file ./images/03-Confusing-Pictures.jpg \
    --conv-mode "phi-2_v0"
```
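
As for the `_queue.Empty` itself: it is typically raised when the `TextIteratorStreamer`'s timeout elapses before the generation thread has produced any text, often because that thread crashed silently. A hedged sketch for debugging, assuming app.py builds the streamer itself:

```python
from transformers import AutoTokenizer, TextIteratorStreamer

tokenizer = AutoTokenizer.from_pretrained("/path/to/checkpoints/llava")  # same path as above

# With timeout=None the queue blocks until text arrives, so a crash in the
# background generation thread surfaces as its own traceback rather than
# being masked by _queue.Empty.
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, timeout=None)
```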
BigJoon commented 8 months ago

Thank you. Thanks to you, I think all the problems have been resolved. I'll close the issue!