vpellegrain opened 1 month ago
Hi @vpellegrain -- we're in the process of revamping our support for image inputs, but @nking-1 is looking into this right now :). We should have updates on this front shortly!
Got this error from a non-vision model here:

```python
from guidance import models

# Plain text-only model -- no image input involved
model_id = 'THUDM/glm-4-9b-chat'
glm_model = models.Transformers(model_id, device_map='auto', trust_remote_code=True)
```

The error message is the same as in the first post here.
Same here; I got this issue while using "microsoft/Phi-3-medium-4k-instruct".
@dittops, are you trying to use a vision input for Phi-3, or just doing plain text generation? We're still working on multimodal support -- will update here when we have the image function working again :).
@liqul -- Thanks for sharing this with us! Tagging @riedgar-ms who might be able to take a look
Hi,
I'm trying to constrain the generation of my VLMs using this repo; however, I can't figure out how to customize the pipeline for handling combined inputs (query + image). While this is documented for VertexAI models (Gemini, in this case), it is not transposable to Transformers models.
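Hence, a call along the following lines (a minimal sketch of my attempt; the exact checkpoint name, prompt, and image path are placeholders):

```python
from guidance import models, gen, image

# MiniCPM-V loaded like any other Transformers model (checkpoint name assumed)
vlm = models.Transformers("openbmb/MiniCPM-V", device_map="auto", trust_remote_code=True)

# Query + image, following the pattern documented for the VertexAI (Gemini) models
lm = vlm + "Describe this image: " + image("example.jpg") + gen("answer", max_tokens=128)
```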
results in:

```
TypeError: MiniCPMV.forward() missing 1 required positional argument: 'data'
```

(presumably because guidance calls the model with the standard text-only arguments, while MiniCPMV.forward() expects its multimodal 'data' argument).
Trying instead with "microsoft/Phi-3-vision-128k-instruct" results in:

```
ValueError: The tokenizer being used is unable to convert a special character in ’•¶∂ƒ˙∆£Ħ爨ൠᅘ∰፨.
```
(I also tried loading the model and the tokenizer manually and passing them to the guidance.models call, but the error is unchanged.)
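Concretely, that manual attempt was along these lines (a sketch; the exact from_pretrained kwargs may differ):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from guidance import models

# Load model and tokenizer by hand, then pass both objects to guidance
model_id = "microsoft/Phi-3-vision-128k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", trust_remote_code=True)

lm = models.Transformers(model, tokenizer=tokenizer)  # raises the same ValueError
```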
Is it possible to specify or customize the input-handling pipeline for such models?
Thanks