Open VIXIXIVIIIX opened 2 weeks ago
@oyzh-oyzh Your ViT has not been modified according to his instructions yet.
Hi @JHYsama, thanks for the reply. I am hitting the same problem as issue #18. I did run the resize script, but after copying the config file I get the same error as in issue #18. Could you tell me how to modify the ViT correctly?
Hi, could you please check the position embedding size in your modified ViT model checkpoint? If you hit the same error as issue #18, you are still using the 224-resolution ViT. We'll update the inference script later, together with the G version model, so that manual modification is no longer needed.
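For anyone following along, one way to read the position embedding size is sketched below. It assumes a CLIP-style ViT with a single class token and a square patch grid, and a patch size of 14; the function name `infer_image_size` is made up for illustration and is not part of this repository. With a Hugging Face `CLIPVisionModel`, the row count would come from `model.vision_model.embeddings.position_embedding.weight.shape[0]`.

```python
import math


def infer_image_size(num_positions: int, patch_size: int = 14) -> int:
    """Infer the input resolution from the position embedding row count.

    Assumption: one class token plus a square grid of patches, as in
    CLIP-style ViTs. patch_size=14 is an assumed default, not a value
    taken from this repository.
    """
    num_patches = num_positions - 1  # drop the class-token position
    grid = math.isqrt(num_patches)
    if grid * grid != num_patches:
        raise ValueError(
            f"{num_positions} positions do not form a square grid plus a class token"
        )
    return grid * patch_size


# 257 rows correspond to a 16x16 grid (224 input); 577 rows to a 24x24 grid (336 input).
for n in (257, 577):
    print(n, "->", infer_image_size(n))
```

If the checkpoint still reports 257 rows with patch size 14, the resized weights were probably not the ones that got loaded.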
Thanks for your reply. I solved the problem by setting ignore_mismatched_sizes=True in llava.model.multimodal_encoder.clip_encoder.
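A note of caution on this workaround: in Hugging Face `from_pretrained`, `ignore_mismatched_sizes=True` skips tensors whose shapes differ and leaves them newly initialized instead of loading them, so the load succeeds but those weights are not the trained ones. The sketch below lists which keys would be affected. The helper `mismatched_keys` and the shapes are hypothetical, chosen only to illustrate a 224-resolution checkpoint meeting a larger model.

```python
def mismatched_keys(checkpoint_shapes, model_shapes):
    """Return keys present in both mappings whose shapes differ."""
    return sorted(
        key
        for key, shape in checkpoint_shapes.items()
        if key in model_shapes and tuple(model_shapes[key]) != tuple(shape)
    )


# Hypothetical shapes: a 224-resolution checkpoint and a 336-resolution model.
checkpoint = {
    "embeddings.position_embedding.weight": (257, 1024),
    "embeddings.class_embedding": (1024,),
}
model = {
    "embeddings.position_embedding.weight": (577, 1024),
    "embeddings.class_embedding": (1024,),
}

print(mismatched_keys(checkpoint, model))
```

If the position embedding shows up in this list, it may be worth re-checking that the resized checkpoint is the one being loaded before trusting the model's outputs.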
When I run the three demos,
only the second one, Multi Categories with Multi Objects, gives the right output format, but its result is wrong.
When I run the first and the third commands, the output looks like this: