Open karndeepsingh opened 2 years ago
Hi, I want to train a multi-modal model that uses image and text inputs for multi-label classification.
Could you please help me understand which recent multi-modal models take an image and text as input and can be fine-tuned on my classification task?
Looking forward to your reply.
Thanks
Hi, you can find a list of multi-modal models implemented in this codebase here
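On the multi-label side, the usual setup (independent of which model you pick) is one logit per label, a sigmoid on each logit, and binary cross-entropy against a multi-hot target, rather than a softmax over classes. Here is a minimal pure-Python sketch of that part only. It assumes your fine-tuned image+text model already produces one raw logit per label; the label names, logit values, and helper names below are hypothetical and not taken from this codebase.

```python
import math

def multi_hot(active_labels, all_labels):
    """Encode the labels that apply to one example as a 0/1 vector."""
    active = set(active_labels)
    return [1.0 if label in active else 0.0 for label in all_labels]

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, computed from raw logits."""
    total = 0.0
    for x, t in zip(logits, targets):
        # Stable form of -[t*log(s) + (1-t)*log(1-s)] with s = sigmoid(x)
        total += max(x, 0.0) - x * t + math.log1p(math.exp(-abs(x)))
    return total / len(logits)

def predict(logits, all_labels, threshold=0.5):
    """Each label is decided independently, so several can be active at once."""
    return [label for x, label in zip(logits, all_labels) if sigmoid(x) >= threshold]

# Hypothetical label set and example.
LABELS = ["cat", "dog", "outdoor"]
targets = multi_hot(["cat", "outdoor"], LABELS)

# Hypothetical logits, standing in for the output of a classification head
# on top of the fused image+text representation.
logits = [2.0, -1.5, 0.3]

loss = bce_with_logits(logits, targets)
predicted = predict(logits, LABELS)
print(targets, round(loss, 4), predicted)
```

In a real training loop the same loss would come from your framework's built-in binary cross-entropy on logits, and the 0.5 threshold is just a starting point that is often tuned per label on a validation set.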