Add BEiTv3

NielsRogge commented 1 year ago

Model description

Microsoft just open-sourced BEiTv3: https://github.com/microsoft/unilm/tree/master/beit3

This is a very powerful vision-language model that can be used as a backbone for a variety of downstream tasks, from image classification to VQA to object detection.

Time to add it to HF Transformers! :)

Open source status

[X] The model implementation is available
[X] The model weights are available

Provide useful links for the implementation

https://github.com/microsoft/unilm/tree/master/beit3

raghavanone commented 1 year ago

I will start working on adding the model!

BakingBrains commented 1 year ago

If required, I can help as well.

ofettaya commented 1 year ago

Hello, I want to test the model's zero-shot image-text retrieval on some images and texts. Can you help me?
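
For context on what zero-shot image-text retrieval involves: BEiTv3 is not yet available in Transformers at this point in the thread, so the sketch below does not call any model API. It assumes the image and text embeddings have already been computed elsewhere (for example with the original implementation linked above) and only illustrates the ranking step, using cosine similarity. The function names and the toy vectors are hypothetical and chosen purely for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def rank_images_for_text(text_emb, image_embs):
    """Return image indices sorted from most to least similar to the text.

    Both inputs are assumed to be precomputed embeddings living in the
    same vector space; this function does not run any model.
    """
    text_unit = l2_normalize(text_emb)
    scored = []
    for idx, image_emb in enumerate(image_embs):
        image_unit = l2_normalize(image_emb)
        score = sum(a * b for a, b in zip(text_unit, image_unit))
        scored.append((score, idx))
    # Highest cosine similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored]


# Toy, made-up embeddings standing in for real model outputs.
image_embs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
text_emb = [0.9, 0.1, 0.0]

ranking = rank_images_for_text(text_emb, image_embs)
print(ranking)
```

Text-to-image retrieval ranks images for one text query as shown; image-to-text retrieval is the same computation with the roles of the two embedding sets swapped.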

princeagarwalmeesho commented 11 months ago

Is this PR going to be merged soon so that the BEiTv3 modules are available to everyone?

huggingface / transformers

Add BEiTv3 #22178
