Open metemadi opened 7 months ago
Thank you for the report! Will try to take a look later.
Thank you! I found the issue - the fix is here, for the similar (larger) IDEFICS model: https://huggingface.co/HuggingFaceM4/idefics-80b-instruct/discussions/10/files
I think the base64-encoded image is maxing out the sequence length - when you add the regex to truncate the image portion, it seems to work perfectly! I wonder whether the same should be done in the URL case too, since the URL appears to stay in the model's context, which I'm guessing is not the intent?
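For intuition, here is a small Python sketch of why the inlined image eats the sequence length and what a regex fix does. The prompt format, the `<image>` placeholder, and the exact pattern are illustrative assumptions - the real pattern is in the linked tokenizer change:

```python
import re

# An uploaded image arrives inlined as a markdown-style data URI,
# e.g. ![](data:image/png;base64,iVBORw0KGgo...). Tens of thousands of
# base64 characters count against the sequence length, so the actual
# question can get truncated away before the model sees it.
prompt = "User: ![](data:image/png;base64," + "A" * 50_000 + ") What is in this image?"

# Sketch of the fix: strip the base64 payload so only a short
# placeholder reaches the tokenizer (pattern is illustrative only).
cleaned = re.sub(r"!\[\]\(data:image/[^;]+;base64,[^)]*\)", "<image>", prompt)

print(len(prompt))   # > 50,000 characters
print(cleaned)       # User: <image> What is in this image?
```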
I'm not sure of the best place to apply this fix - in a particular model revision? In TGI? I'm currently doing it in a really hacky way: I run my Docker containers, stop them, go into the TGI container's storage mount, modify the tokenizer JSON file as per the above, and then run them again. Messy, but it works!
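If it helps anyone, the manual edit could be scripted instead of hand-editing the file in the storage mount. This sketch assumes the Hugging Face `tokenizers` JSON serialization format for the normalizer section; the regex is a placeholder - take the exact pattern from the linked idefics-80b-instruct discussion:

```python
import json

def add_image_strip_rule(tok: dict) -> dict:
    """Prepend a Replace normalizer that drops inlined base64 images.

    The JSON shape follows the Hugging Face `tokenizers` serialization
    format; the regex below is illustrative, not the verified fix."""
    rule = {
        "type": "Replace",
        "pattern": {"Regex": r"!\[\]\(data:image/[^;]+;base64,[^)]*\)"},
        "content": "",
    }
    existing = tok.get("normalizer")
    if existing is None:
        tok["normalizer"] = rule
    elif existing.get("type") == "Sequence":
        # Run the new rule before any existing normalizers.
        existing["normalizers"].insert(0, rule)
    else:
        tok["normalizer"] = {"type": "Sequence", "normalizers": [rule, existing]}
    return tok

# Usage: load tokenizer.json from the TGI storage mount, patch, write back.
# tok = json.load(open("tokenizer.json"))
# json.dump(add_image_strip_rule(tok), open("tokenizer.json", "w"))
tok = add_image_strip_rule({"normalizer": None, "model": {}})
print(tok["normalizer"]["type"])  # Replace
```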
Thank you again for such an amazing set of products!
I am experiencing the same error, but when I apply your solution I get another error from TGI about a shape mismatch, as described in the thread below:
https://github.com/huggingface/transformers/issues/31380
I am using the latest TGI with HuggingFaceM4/idefics2-8b
Any clue?
Hi! Sorry, I'm really not sure - I'm definitely not an expert on this. One thing: I can't remember whether I turned "normalization" on or just did the regex change - maybe that's it? (It was a while ago.) Separately, the shape mismatch seems odd to me and perhaps unrelated. Sorry I can't be of more help, but perhaps others will reply!
Hi there,
I am hosting IDEFICS on TGI locally and hitting it via Chat-UI. When I provide an image via URL, it works with the image correctly, as seen here:
However, when I use the "upload image" feature or drag-and-drop, it looks like the image is ignored:
Here is my model config (using the latest main branch of chat-ui 0.7.0):
{
  "name": "HuggingFaceM4/idefics-9b-instruct",
  "endpoints": [{ "type": "tgi", "url": "http://INTERNAL URL:/generate_stream" }],
  "multimodal": true,
  "description": "IDEFICS is the new multimodal model by Hugging Face.",
  "preprompt": "",
  "chatPromptTemplate": "{{#each messages}}{{#ifUser}}User: {{content}}{{/ifUser}}<end_of_utterance>\nAssistant: {{#ifAssistant}}{{content}}\n{{/ifAssistant}}{{/each}}",
  "parameters": {
    "temperature": 0.1,
    "top_p": 0.95,
    "repetition_penalty": 1.2,
    "top_k": 12,
    "truncate": 1000,
    "max_new_tokens": 1024,
    "stop": ["<end_of_utterance>", "User:", "\nUser:"]
  }
}
And here is my TGI command args from docker compose (using 1.3.4 container for TGI):
--sharded true --num-shard 2 --dtype float16 --model-id HuggingFaceM4/idefics-9b-instruct --max-total-tokens 2048 --max-input-length 1000 --max-batch-prefill-tokens 2048
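For anyone reproducing this outside compose, the same flags map onto a plain docker run invocation. The image tag matches the 1.3.4 container mentioned above; the port mapping, shm size, and volume path are assumptions, not taken from my actual setup:

```shell
# Sketch: standalone equivalent of the compose args above.
docker run --gpus all --shm-size 1g -p 8080:80 \
  -v "$PWD/data:/data" \
  ghcr.io/huggingface/text-generation-inference:1.3.4 \
  --sharded true --num-shard 2 --dtype float16 \
  --model-id HuggingFaceM4/idefics-9b-instruct \
  --max-total-tokens 2048 --max-input-length 1000 \
  --max-batch-prefill-tokens 2048
```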
Here is the image URL for reference.
A gigantic thank you for putting together such wonderful tools!! And thank you in advance for your help.