xenova / transformers.js

State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!
https://huggingface.co/docs/transformers.js
Apache License 2.0
9.82k stars, 579 forks

Any plans to add moondream and build a demo? Xenova/moondream2 #743

BChip closed 1 month ago

BChip commented 1 month ago

Model description

I noticed that https://huggingface.co/Xenova/moondream2 has been created.

Are there plans to add moondream2 in v3, and has anyone started a demo yet?

Prerequisites

Additional information

from transformers import AutoModelForCausalLM, AutoTokenizer
from PIL import Image

# Load the model and tokenizer at a pinned revision
model_id = "vikhyatk/moondream2"
revision = "2024-04-02"
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, revision=revision
)
tokenizer = AutoTokenizer.from_pretrained(model_id, revision=revision)

# Encode the image, then ask a question about it
image = Image.open('<IMAGE_PATH>')
enc_image = model.encode_image(image)
print(model.answer_question(enc_image, "Describe this image.", tokenizer))
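Because the snippet above calls `encode_image` separately from `answer_question`, the encoded image can presumably be reused across several prompts. Below is a minimal sketch of that idea, assuming the encoding does not depend on the question. `ask_many` is a hypothetical helper, not part of moondream2 or transformers, and it relies only on the two calls shown above.

```python
def ask_many(model, tokenizer, image, questions):
    """Encode `image` once, then answer each question against that encoding.

    Hypothetical helper: it assumes `model` exposes `encode_image` and
    `answer_question` exactly as used in the snippet above.
    """
    enc_image = model.encode_image(image)  # encoded once, reused for every question
    return {
        question: model.answer_question(enc_image, question, tokenizer)
        for question in questions
    }
```

With the `model`, `tokenizer`, and `image` objects from the snippet above, `ask_many(model, tokenizer, image, ["Describe this image.", "What colors are visible?"])` would return a dict mapping each question to its answer.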

Your contribution

Please let me know if you need any help with this. I am looking forward to having a tiny VLM available in transformers.js! 🤗

xenova commented 1 month ago

Hi there 👋 Indeed, this is on our list :) The main issue is that the WebGPU version is still pretty slow, but now that we have Phi-3 running w/ WebGPU (demo), you should be seeing a Moondream demo soon. 🤞

xenova commented 1 month ago

It's out! https://huggingface.co/spaces/Xenova/experimental-moondream-webgpu

https://github.com/xenova/transformers.js/assets/26504141/18c37e8b-ffa3-4cba-a824-201bfe53673a

See the model card for usage instructions.