huggingface / optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools
https://huggingface.co/docs/optimum/main/
Apache License 2.0

[New task] Visual Question Answering #980

Open xenova opened 1 year ago

xenova commented 1 year ago

Feature request

Currently, the visual-question-answering pipeline/task in Transformers is not supported for ONNX export:

https://github.com/huggingface/optimum/blob/618a483d463b9c2e0b0ba3b859ff74c6279b5161/optimum/exporters/tasks.py#L147-L170

Here are some models which can be used for testing: https://huggingface.co/models?other=visual-question-answering
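For anyone wanting to confirm the gap locally, here is a minimal sketch of how one might check whether a task name is listed for ONNX export. It assumes optimum is installed and uses `TasksManager.get_supported_tasks_for_model_type`; the helper names `task_is_listed` and `onnx_tasks_for` are hypothetical and exist only for this illustration, and the result depends on the installed optimum version.

```python
from typing import Iterable, List, Optional

TASK = "visual-question-answering"


def task_is_listed(task: str, supported: Iterable[str]) -> bool:
    """Return True if `task` appears among the supported task names."""
    return task in set(supported)


def onnx_tasks_for(model_type: str) -> Optional[List[str]]:
    """Hypothetical helper: list the tasks optimum reports for ONNX export.

    Returns None if optimum cannot be imported, and an empty list if
    optimum does not know the model type at all.
    """
    try:
        from optimum.exporters.tasks import TasksManager
    except ImportError:
        return None
    try:
        # The call returns a mapping keyed by task name; keep only the keys.
        return list(
            TasksManager.get_supported_tasks_for_model_type(model_type, "onnx")
        )
    except KeyError:
        return []
```

Calling `onnx_tasks_for("vilt")` and passing the result (when it is not None) to `task_is_listed(TASK, ...)` should show whether the task is available for that architecture in a given install.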

Motivation

This was brought up as a request for Transformers.js, but I can't do any testing until it's supported by optimum :)

Your contribution

I'll implement it in Transformers.js once supported in optimum 👍

fxmarty commented 1 year ago

Working on it.

bytesandwich commented 3 weeks ago

Hi @fxmarty, that's awesome! Do you have any thoughts on whether exporting the visual-question-answering task is viable? I'd like to export something like dandelin/vilt-b32-finetuned-vqa or Salesforce/blip-vqa-base, and both are tagged with this task.