HyperGAI / HPT

HPT - Open Multimodal LLMs from HyperGAI
https://www.hypergai.com/
Apache License 2.0

A web UI demo might make it easier to try #5

Open SuyongSun opened 6 months ago

SuyongSun commented 6 months ago

Using the command line to run inference on the same image with multiple questions is not very convenient. A web UI would make this easier. I made a simple Gradio web UI demo for this, as follows:

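import tempfile

import gradio as gr
from PIL import Image
from vlmeval.config import supported_VLM

# Load the model once at startup so every request reuses it.
model_name = "hpt-air-demo"
model = supported_VLM[model_name]()

def question_answer(image, question):
    # Gradio passes the uploaded image as a NumPy array, while the model
    # expects a file path, so write the image to a temporary PNG first.
    pil_image = Image.fromarray(image.astype('uint8'), 'RGB')
    with tempfile.NamedTemporaryFile(delete=True, suffix=".png") as f:
        pil_image.save(f.name)
        # Generate inside the `with` block, while the temporary file still exists.
        response = model.generate(prompt=question, image_path=f.name, dataset='demo')
        return response

# One image input and one text input; the answer is shown in a text box.
interface = gr.Interface(fn=question_answer, inputs=["image", "text"], outputs=["textbox"])

# Listen on all network interfaces at port 8860.
interface.launch(server_port=8860, server_name="0.0.0.0")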

The UI looks like this: image
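
Since the motivation here is asking several questions about the same image, the demo could also accept one question per line and answer them all in one go. Below is a minimal sketch of that idea, not part of the HPT codebase. The names `split_questions`, `answer_all`, and the `generate` adapter are hypothetical; `generate` is assumed to take `(prompt, image_path)` and would wrap `model.generate(prompt=..., image_path=..., dataset='demo')` from the demo above.

```python
from typing import Callable, List


def split_questions(text: str) -> List[str]:
    """Split a multi-line text box value into one question per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def answer_all(image_path: str, questions_text: str,
               generate: Callable[[str, str], str]) -> str:
    """Ask every question about the same image and format the answers.

    `generate` is a hypothetical adapter taking (prompt, image_path); in the
    demo above it would wrap model.generate(..., dataset='demo').
    """
    blocks = []
    for question in split_questions(questions_text):
        answer = generate(question, image_path)
        blocks.append(f"Q: {question}\nA: {answer}")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Stand-in for the real model so the sketch runs without HPT installed.
    def fake_generate(prompt: str, image_path: str) -> str:
        return f"(answer to {prompt!r} for {image_path})"

    print(answer_all("cat.png", "What is this?\n\nWhat color is it?", fake_generate))
```

To use it in the web UI, `question_answer` could save the image once and return `answer_all(f.name, question, ...)` with an adapter around `model.generate`, so the image is written to disk a single time for all questions.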

QuangHyperGAI commented 4 months ago

We are working on a demo platform and will announce it soon with a future release. Thank you for your understanding!