Open zappy586 opened 1 month ago
Mmgrounding seems really promising for few-shot object detection, but its early modality fusion makes the architecture very confusing. Has anyone tried converting this model into a few-shot learner, or does anyone have ideas on how to do it?