huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Add BEiTv3 #22178

Open NielsRogge opened 1 year ago

NielsRogge commented 1 year ago

Model description

Microsoft just open-sourced BEiTv3: https://github.com/microsoft/unilm/tree/master/beit3

This is a very powerful vision-language model that can be used as a backbone for a variety of downstream tasks, from image classification to VQA to object detection.

Time to add it to HF Transformers! :)

Open source status

Provide useful links for the implementation

https://github.com/microsoft/unilm/tree/master/beit3

raghavanone commented 1 year ago

I will start working on adding the model!

BakingBrains commented 1 year ago

If required, I can help as well.

ofettaya commented 1 year ago

Hello, I want to test the model's zero-shot image-text retrieval on some images and texts. Can you help me?

princeagarwalmeesho commented 11 months ago

Is this PR going to be merged soon so that the BEiTv3 modules are available to everyone?