microsoft / LLaVA-Med

Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.

So...has anyone actually managed to get this running with results similar to the original paper? #36

Open abhisuri97 opened 11 months ago

abhisuri97 commented 11 months ago

I got as far as running the Gradio interface (after heavy editing of the Gradio script, since most of the interface was actually hidden). However, when I ran it, I got an error on the chat interface similar to #17. Digging further into the logs, I see this error when attempting to evaluate on a single image:

[image: screenshot of the error from the logs]

Anyway, I'm inclined to say that this paper is not reproducible as it stands, unless someone has managed to get a demo running. Even a Hugging Face Space would be helpful to others seeking to verify the results.

yihp commented 10 months ago

I encountered the same problem. Have you solved it?

yihp commented 10 months ago

@abhisuri97 Hello, I also encountered the same problem. Have you solved it?