xenova / transformers.js

State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!
https://huggingface.co/docs/transformers.js
Apache License 2.0

[Model request] OwlV2 #877

Closed: pachacamac closed this issue 1 month ago

pachacamac commented 2 months ago

Model description

I thought this could be a cool add-on for the object-detection pipeline: zero- and one-shot object detection with (as far as I understand) arbitrary, possibly embedding-based, labels. What do you think?

Additional information

No response

Your contribution

Happy to beta test

xenova commented 1 month ago

Hi there! Luckily, OwlV2 is already supported: https://huggingface.co/models?library=transformers.js&other=owlv2&sort=trending

Example usage:

import { pipeline } from '@xenova/transformers';

// Load the zero-shot object detection pipeline with an OwlV2 checkpoint
const detector = await pipeline('zero-shot-object-detection', 'Xenova/owlv2-base-patch16-finetuned');

// Detect objects in the image that match the free-text candidate labels
const url = 'http://images.cocodataset.org/val2017/000000039769.jpg';
const candidate_labels = ['a photo of a cat', 'a photo of a dog'];
const output = await detector(url, candidate_labels);
console.log(output);
// [
//   { score: 0.6951543688774109, label: 'a photo of a cat', box: { xmin: 326, ymin: 23, xmax: 650, ymax: 376 } },
//   { score: 0.5766839385032654, label: 'a photo of a cat', box: { xmin: 6, ymin: 63, xmax: 315, ymax: 487 } }
// ]
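
Each result above is a plain object with a `score`, a `label`, and a `box` in pixel coordinates, so it can be post-processed with ordinary array methods. As a rough sketch (the helper name `summarizeDetections` and the `minScore` cutoff of 0.6 are made up for illustration and are not part of transformers.js), something like this would keep only the more confident detections and report each box's size:

```javascript
// Hypothetical helper (not part of transformers.js): keep detections whose
// score is at least minScore, sort them by descending score, and report
// each box's width and height in pixels.
function summarizeDetections(detections, minScore = 0.6) {
  return detections
    .filter((d) => d.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .map(({ score, label, box }) => ({
      label,
      score: Number(score.toFixed(2)),
      width: box.xmax - box.xmin,
      height: box.ymax - box.ymin,
    }));
}

// Sample input copied from the example output above.
const sample = [
  { score: 0.6951543688774109, label: 'a photo of a cat', box: { xmin: 326, ymin: 23, xmax: 650, ymax: 376 } },
  { score: 0.5766839385032654, label: 'a photo of a cat', box: { xmin: 6, ymin: 63, xmax: 315, ymax: 487 } },
];

console.log(summarizeDetections(sample));
// [ { label: 'a photo of a cat', score: 0.7, width: 324, height: 353 } ]
```

With the sample above, only the first cat passes the 0.6 cutoff; lowering `minScore` to 0.5 would keep both.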