Closed by NielsRogge 2 years ago
Tagging @mishig25 for the widget
Also LXMERT should handle this task, but likely has a very different API.
This sounds amazing. Happy to contribute in any way I can.
I'd love to pick this up!
Hey @sijunhe, I'm just starting out in open-source, but I'd like to help out however I can!
@sabarish-srinivasan appreciate the help but I saw this a little late and I am almost done with the PR.
@sijunhe No problem, thanks for letting me know!
@LysandreJik I looked at both ViLT and LXMERT and I don't think it's possible to combine these two into a single pipeline for the following reasons:
Yes, don't think we should support LXMERT for the pipeline, since it isn't entirely included in the Transformers library.
Sounds good, let's go with ViLT then!
Now that #17286 is merged, should this issue be closed?
Yes :) Thank you for your contribution @sijunhe!
Feature request
We currently have ViLT in the library, which, among other tasks, is capable of performing visual question answering (VQA).
It would be great to have a pipeline for this task, with the following API:
This pipeline could default to the https://huggingface.co/dandelin/vilt-b32-finetuned-vqa checkpoint. Also check out the Space that showcases the model.
This can be implemented similarly to other pipelines. For an example PR that added a pipeline, see #11598.
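For illustration, here is a rough sketch of what calling such a pipeline could look like. It is not the API snippet originally proposed in this issue. It assumes the task name `visual-question-answering` and the `image` / `question` / `top_k` call arguments that the Transformers VQA pipeline uses, together with the ViLT checkpoint mentioned above. The helper names `answer_question` and `top_answer` are hypothetical and exist only for this example.

```python
from typing import Any, Dict, List


def top_answer(predictions: List[Dict[str, Any]]) -> str:
    """Return the highest-scoring answer from a list of {"score", "answer"} dicts."""
    return max(predictions, key=lambda p: p["score"])["answer"]


def answer_question(image: str, question: str, top_k: int = 1) -> List[Dict[str, Any]]:
    """Run VQA on an image path or URL (hypothetical helper, not a library function)."""
    # Imported lazily so top_answer() stays usable without transformers installed.
    from transformers import pipeline

    # Assumption: task name and checkpoint as used by the Transformers VQA pipeline.
    vqa = pipeline(
        "visual-question-answering",
        model="dandelin/vilt-b32-finetuned-vqa",
    )
    # The pipeline is expected to return dicts with "score" and "answer" keys.
    return vqa(image=image, question=question, top_k=top_k)
```

Something like `top_answer(answer_question("cats.png", "How many cats are there?", top_k=3))` would then give the most likely answer as a string. The first call downloads the checkpoint, and `cats.png` is a placeholder path.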
Motivation
A pipeline is required in order to have inference widgets + a task defined at hf.co/tasks.
Moreover, it would be great to do VQA in two lines of code.
Your contribution
I can definitely assist with this, together with @Narsil, who's the pipeline expert.