Open NielsRogge opened 1 year ago
I will start working on adding the model!
If required, I can help as well.
Hello, I want to test the model's zero-shot image-text retrieval on some images and texts. Can you help me?
Is this PR going to be merged soon so that the BEiT-3 modules are available to everyone?
Model description
Microsoft just open-sourced BEiT-3: https://github.com/microsoft/unilm/tree/master/beit3
This is a very powerful vision-language model that can be used as a backbone for a variety of downstream tasks, from image classification to VQA to object detection.
Time to add it to HF Transformers! :)
Open source status
Provide useful links for the implementation
https://github.com/microsoft/unilm/tree/master/beit3