Closed zhimin-z closed 4 weeks ago
HEIM is for text-to-image models and VHELM is for vision-language models.
Thanks for your answer. The vision-language model can do both text-to-image and image-to-text tasks, right?
No - of the current models listed on VHELM v1.0.0, only GPT-4o supports image generation. Additionally, GPT-4o's image generation is not yet generally available through OpenAI's API.