Adding support for MobileViTV2 model

huggingface / transformers.js

State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!

https://huggingface.co/docs/transformers.js

Apache License 2.0


Adding support for MobileViTV2 model #720

Closed (laszlokiss-szelena closed this issue 5 months ago)

laszlokiss-szelena commented 6 months ago

Model description

Hi,

I would love to use MobileViTV2 in my application. I am definitely not an expert, but its architecture looks pretty similar to MobileViT, so adding it seems fairly straightforward to me.

Laszlo

Prerequisites

[X] The model is supported in Transformers (i.e., listed here)
[ ] The model can be exported to ONNX with Optimum (i.e., listed here)

Additional information

No response

Your contribution

I experimented with this model on my fork here: https://github.com/KLaci/transformers.js/commit/e1e02b1d5876c1ffe4cadb53a01d592efee623a3

I can submit a PR too if needed.

xenova commented 6 months ago

Hi there 👋 Looks like the ONNX export isn't as simple as I originally thought (see here). Is this something you'd be able to look into? :)

xenova commented 6 months ago

Okay I might have got it working.

xenova commented 6 months ago

Example code (requires https://github.com/xenova/transformers.js/pull/721):

import { pipeline } from '@xenova/transformers';

// Sample image to classify.
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg';
// Create an image-classification pipeline; `quantized: false` loads the unquantized weights.
const classifier = await pipeline('image-classification', 'Xenova/mobilevitv2-1.0-imagenet1k-256', {
    quantized: false,
});
// Run inference on the image URL.
const output = await classifier(url);
// [{ label: 'tiger, Panthera tigris', score: 0.6491137742996216 }]
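
For anyone consuming that result in an app, here is a minimal sketch of picking the most confident prediction from the array of `{ label, score }` objects shown above. The helper name `topPrediction` and its `minScore` threshold are hypothetical, not part of transformers.js. The sample data is copied from the output above, so the snippet runs without downloading the model.

```javascript
// Hypothetical helper (NOT part of transformers.js): returns the highest-scoring
// prediction, or null if the list is empty or the best score is below minScore.
function topPrediction(predictions, minScore = 0) {
    if (!Array.isArray(predictions) || predictions.length === 0) return null;
    // Copy before sorting so the caller's array is not mutated.
    const [best] = [...predictions].sort((a, b) => b.score - a.score);
    return best.score >= minScore ? best : null;
}

// Sample data copied from the classifier output shown above.
const sample = [{ label: 'tiger, Panthera tigris', score: 0.6491137742996216 }];

const best = topPrediction(sample, 0.5);
console.log(best ? best.label : 'no confident prediction');
// prints "tiger, Panthera tigris"
```

In a real app, `sample` would be replaced by the `output` returned from `await classifier(url)`. The 0.5 threshold is only an illustrative value.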