A web UI demo might make this easier to try

Running inference from the command line is not convenient when you want to ask several questions about the same image. A web UI would make this easier, so I wrote a simple Gradio web UI demo, as follows:

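import gradio as gr
from PIL import Image
import tempfile
from vlmeval.config import supported_VLM

# Load the model once at startup so every request reuses it.
model_name = "hpt-air-demo"
model = supported_VLM[model_name]()

def question_answer(image, question):
    # Gradio passes the uploaded image as a NumPy array; convert it to a PIL image.
    pil_image = Image.fromarray(image.astype('uint8'), 'RGB')
    # model.generate takes a file path, so write the image to a temporary PNG.
    # The file is deleted automatically when the with-block exits.
    with tempfile.NamedTemporaryFile(delete=True, suffix=".png") as f:
        pil_image.save(f.name)
        response = model.generate(prompt=question, image_path=f.name, dataset='demo')
        return response

# One image input and one text input, with the answer shown in a textbox.
interface = gr.Interface(fn=question_answer, inputs=["image", "text"], outputs=["textbox"])

# Listen on all network interfaces, port 8860.
interface.launch(server_port=8860, server_name="0.0.0.0")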

The UI looks like this:


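Separately from the web UI, the same "many questions, one image" use case can be scripted. The sketch below is only an illustration: it assumes `model.generate` accepts the same `prompt`, `image_path` and `dataset` keyword arguments as in the demo, and `ask_many` and `EchoModel` are hypothetical names that are not part of the repository. `EchoModel` is a stand-in so the snippet runs without loading HPT.

```python
from typing import Dict, Iterable


def ask_many(model, image_path: str, questions: Iterable[str]) -> Dict[str, str]:
    """Ask several questions about one image and collect the answers."""
    answers = {}
    for question in questions:
        # Same keyword arguments as the demo's model.generate call (assumed).
        answers[question] = model.generate(
            prompt=question, image_path=image_path, dataset="demo"
        )
    return answers


class EchoModel:
    """Hypothetical stand-in for the real model; it only echoes its inputs."""

    def generate(self, prompt, image_path, dataset):
        return f"[{dataset}] {image_path}: {prompt}"


if __name__ == "__main__":
    questions = ["What is in the picture?", "What color is it?"]
    for q, a in ask_many(EchoModel(), "cat.png", questions).items():
        print(q, "->", a)
```

To use it with the real model, pass the `model` object from the demo instead of `EchoModel()`, along with a path to an image file on disk.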