Open ktrapeznikov opened 1 week ago
Sometimes it works with landscape images of certain sizes; sometimes it also crashes. Do image sizes have to be multiples of 336?
Same problem: Method Prefill encountered an error.
It seems that the current implementation counts the tokens generated from the encoded image as part of the prompt length. It might be better to extract the image features first and then calculate the prompt token length separately. I'm not sure if TGI has support for this approach, as it could be quite involved.
Same issue; only images with width == height work.
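Since square images are reported to work, one possible stopgap is to pad an image to a square before sending it. The sketch below only computes the canvas size and paste offsets; `square_padding` is a hypothetical helper name, and whether padding actually avoids the crash on a given TGI version is an assumption based on the reports in this thread, not something verified here.

```python
def square_padding(width: int, height: int) -> tuple[int, int, int]:
    """Return (side, left, top) for centering a width x height image
    on a square canvas whose side is the longer image edge."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    side = max(width, height)
    left = (side - width) // 2   # horizontal offset of the original image
    top = (side - height) // 2   # vertical offset of the original image
    return side, left, top


# Example with the non-square size from the report (286 wide, 524 high).
side, left, top = square_padding(286, 524)
print(side, left, top)
```

With an imaging library such as Pillow, the result could be applied by creating a `side` x `side` canvas with `Image.new` and pasting the original at `(left, top)`. Padding changes what the model sees, so answers may differ from those for the unpadded image.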
I have the same issue, and it seems to be linked to image size. I found that some sizes work in TGI v2.0.1 but not in TGI v2.0.2, and vice versa.
Here is a recap of the image sizes I tested. Note that image 2 bis is image 2 cropped to 450 x 299, to confirm that the dimensions are what causes the issue.
Image | dimension | ratio L/W | works in v2.0.1 | works in v2.0.2 |
---|---|---|---|---|
1 | 450 x 299 | 1.505 | No | Yes |
2 | 800 x 531 | 1.506 | Yes | No |
2 bis | 450 x 299 | 1.505 | No | Yes |
3 | 300 x 168 | 1.785 | No | Yes |
4 | 640 x 480 | 1.333 | Yes | Yes |
5 | 934 x 934 (square) | 1 | Yes | Yes |
When an image does not have a suitable size, the server encounters an error and crashes. Here are the logs I get:
v2.0.1 (image 1 crash)
ERROR text_generation_launcher: Method Prefill encountered an error.
...
RuntimeError: shape mismatch: value tensor of shape [1464, 4096] cannot be broadcast to indexing result of shape [1376, 4096]
...
ERROR batch{batch_size=1}:prefill:prefill{id=0 size=1}:prefill{id=0 size=1}: text_generation_client: router/client/src/lib.rs:33: Server error: CANCELLED
ERROR batch{batch_size=1}:prefill:clear_cache{batch_id=Some(0)}:clear_cache{batch_id=Some(0)}: text_generation_client: router/client/src/lib.rs:33: Server error: transport error
ERROR chat_completions:generate:generate_stream:infer:send_error: text_generation_router::infer: router/src/infer.rs:866: Request failed during generation: Server error: CANCELLED
...
ERROR text_generation_launcher: Shard 0 crashed
v2.0.2 (image 2 crash, not happening at warmup)
INFO text_generation_launcher: Found 2095 in image of resolution 531x800
ERROR text_generation_launcher: Method Prefill encountered an error.
...
RuntimeError: shape mismatch: value tensor of shape [2144, 4096] cannot be broadcast to indexing result of shape [2095, 4096]
...
RuntimeError: Cannot fill images right now. If error happens at warmup, make sure you have enough `--max-input-tokens` to handle images. If error happens at regular runtime, please fill in an issue: shape mismatch: value tensor of shape [2144, 4096] cannot be broadcast to indexing result of shape [2095, 4096]
...
ERROR batch{batch_size=1}:prefill:prefill{id=0 size=1}:prefill{id=0 size=1}: text_generation_client: router/client/src/lib.rs:33: Server error: CANCELLED
ERROR batch{batch_size=1}:prefill:clear_cache{batch_id=Some(0)}:clear_cache{batch_id=Some(0)}: text_generation_client: router/client/src/lib.rs:33: Server error: transport error
ERROR chat_completions:generate:generate_stream:infer:send_error: text_generation_router::infer: router/src/infer.rs:866: Request failed during generation: Server error: CANCELLED
...
ERROR text_generation_launcher: Shard 0 crashed
My model info
{
model_id: "llava-hf/llava-v1.6-mistral-7b-hf",
validation_workers: 2,
trust_remote_code: false,
max_concurrent_requests: 128,
max_best_of: 2,
max_stop_sequences: 4,
max_top_n_tokens: 5,
max_input_tokens: Some(4000),
max_total_tokens: Some(5000),
waiting_served_ratio: 0.3,
max_waiting_tokens: 20,
hostname: "0.0.0.0",
port: 80,
shard_uds_path: "/tmp/text-generation-server",
master_addr: "localhost",
master_port: 29500,
huggingface_hub_cache: Some("/data"),
disable_custom_kernels: false,
cuda_memory_fraction: 1.0,
json_output: false,
cors_allow_origin: [],
ngrok: false,
disable_grammar_support: false,
env: false,
max_client_batch_size: 4,
}
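The two numbers in the v2.0.2 log (2144 and 2095) can be reproduced with a little arithmetic, which hints at a rounding difference in how the unpadded image features are counted. The sketch below is a reconstruction, not TGI's or the model's actual code. It assumes 336 px tiles with 24 x 24 features each, the candidate canvas sizes listed in `GRID_PINPOINTS`, one newline token per feature row, and integer arithmetic in place of float rounding. All function names are hypothetical.

```python
import math

TILE = 336   # assumed vision-tower input size in pixels
PATCH = 14   # assumed patch size, giving 24 x 24 features per tile
# Assumed (height, width) candidates for the high-resolution canvas.
GRID_PINPOINTS = [(336, 672), (672, 336), (672, 672), (1008, 336), (336, 1008)]


def select_best_resolution(height, width, candidates=GRID_PINPOINTS):
    """Pick the canvas that keeps the most image pixels, then wastes the least area."""
    best, best_effective, best_wasted = None, -1, math.inf
    for cand_h, cand_w in candidates:
        scale = min(cand_w / width, cand_h / height)
        scaled_w, scaled_h = int(width * scale), int(height * scale)
        effective = min(scaled_w * scaled_h, width * height)
        wasted = cand_h * cand_w - effective
        if effective > best_effective or (
            effective == best_effective and wasted < best_wasted
        ):
            best, best_effective, best_wasted = (cand_h, cand_w), effective, wasted
    return best


def count_image_tokens(height, width, symmetric_padding):
    """Count base features + unpadded high-res features + one newline per row.

    symmetric_padding=True removes the same whole number of padded rows or
    columns from both sides; False keeps exactly the rescaled image extent.
    """
    per_side = TILE // PATCH                      # 24
    base = per_side * per_side                    # 576 features for the base tile
    canvas_h, canvas_w = select_best_resolution(height, width)
    grid_h = canvas_h // TILE * per_side
    grid_w = canvas_w // TILE * per_side
    if width * grid_h > grid_w * height:          # image is wider than the canvas
        new_h = height * grid_w // width
        rows = grid_h - 2 * ((grid_h - new_h) // 2) if symmetric_padding else new_h
        cols = grid_w
    else:                                         # image is taller than (or fits) the canvas
        new_w = width * grid_h // height
        cols = grid_w - 2 * ((grid_w - new_w) // 2) if symmetric_padding else new_w
        rows = grid_h
    return base + rows * cols + rows


# Image 2 (800 wide, 531 high), logged as "resolution 531x800".
print(count_image_tokens(531, 800, symmetric_padding=True))   # 2144
print(count_image_tokens(531, 800, symmetric_padding=False))  # 2095
```

Under these assumptions, removing padding symmetrically leaves 32 feature rows (2144 tokens, the shape of the value tensor), while keeping the rescaled height leaves 31 rows (2095 tokens, the count in the "Found 2095" line). For the 450 x 299 image both variants give 1464, which matches the feature count in the v2.0.1 log, so the 1376 seen there would need a different explanation.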
System Info
Running in docker
CLI Arguments
Reproduction
Here is a script that I run on this image with the prompt "Describe the image?". Note that the image is 286 × 524. It returns an error and the service crashes.
Logs from the TGI service
Expected behavior
When I run the same script on an image that's square (554x554), it behaves as expected.
Response
Logs from TGI