Closed: puar-playground closed this issue 2 months ago
I am seeing a similar thing. I thought it had to do with the VQGAN image encoder going haywire on certain inputs, but every once in a while a request succeeds after a few refusals, so the circuits are clearly there. Things I've been trying: recognizing plants, food, OCR - not things I'd consider dangerous. Wondering if anyone has found a prompt strategy that minimizes incorrect refusals.
I have the same problem. I tried different prompts, but all get the same or similar response. :/ Edit: I was able to get simple responses to questions like "What language is in the image?", although the answers were incorrect. I wanted to run OCR tasks and see how it performed. Any advice on what kinds of prompts work? I tried prompts that work fine for llama2, llama3, and llava1.6, but here all I get is "Sorry, I cannot blah blah".
Question scope generally needs to be pared down to reduce the likelihood that the model refuses to answer, i.e. ask more specific questions about an image rather than open-ended ones.
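To make that concrete, here is a minimal sketch of narrowing the question while keeping the mixed image/text prompt structure used in the linked `multimodal_input.py` example. The `prompt_ui` field names and the `<END-OF-TURN>` sentinel are taken from that example; double-check them against the version in your checkout:

```python
def build_prompt_ui(image_path: str, question: str) -> list[dict]:
    """Build an image+text prompt in the structure used by the Chameleon
    multimodal_input.py example: an image part, a text part, and an
    end-of-turn sentinel."""
    return [
        {"type": "image", "value": f"file:{image_path}"},
        {"type": "text", "value": question},
        {"type": "sentinel", "value": "<END-OF-TURN>"},
    ]

# Open-ended phrasing that tends to trigger refusals:
open_ended = build_prompt_ui("plant.jpg", "What can you tell me about this?")

# Narrow, specific phrasing that is more likely to get an answer:
specific = build_prompt_ui("plant.jpg", "What species of plant is shown in this image?")
```

With a model loaded as in the demo, the list is passed as `model.generate(prompt_ui=...)`; the only thing that changes between the two attempts is the wording of the text part.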
I tried to run multimodal inference following this demo code, but the model keeps responding with excuses such as:
> I'm unable to meet that request.
> I must politely decline that, sorry.
> I'm sorry, but that's something I cannot do.
> I'm sorry, but I'm unable to comply with that request.
Has anyone else encountered this issue? I think it might be a mistake in how I loaded the model, but I followed the instructions here line by line:
https://github.com/facebookresearch/chameleon/blame/3356bda40896f73d8c8d03c19694ec1607c477ed/chameleon/inference/examples/multimodal_input.py#L9-L24