How do I do multimodal classification?

Hi @forkbabu ,

We suggest modifying run_finetuning_cls.py, using run_finetuning_depth.py as a reference for how to use multiple modalities for transfers. Their respective config files can be found in cfgs/finetune/cls and cfgs/finetune/depth.

When making the classification finetuning script multi-modal, make sure to also modify the augmentations (cutmix, mixup, cropping, flipping, etc.) so that they support multiple modalities, since these augmentations are usually crucial for good performance on ImageNet. Alternatively, you can skip these augmentations to simplify things.
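To illustrate what "supporting multiple modalities" can mean for these augmentations, here is a minimal sketch. It is not code from the MultiMAE repository: the function names `random_flip_crop` and `multimodal_mixup`, the dict-of-arrays input layout, and the use of NumPy instead of PyTorch tensors are all assumptions made for the example. The key idea is that the random parameters (crop offset, flip decision, mixup coefficient, sample permutation) are drawn once and then applied identically to every modality, so that the modalities stay spatially aligned.

```python
import numpy as np


def random_flip_crop(sample, crop_size, rng):
    """Apply ONE shared random crop and horizontal flip to all modalities.

    sample: dict mapping a modality name to an array of shape (C, H, W).
            All modalities are assumed to share the same H and W.
    (Hypothetical helper, not part of MultiMAE.)
    """
    h, w = next(iter(sample.values())).shape[-2:]
    # Draw the random parameters once, outside the per-modality loop.
    top = int(rng.integers(0, h - crop_size + 1))
    left = int(rng.integers(0, w - crop_size + 1))
    flip = bool(rng.random() < 0.5)

    out = {}
    for name, x in sample.items():
        x = x[..., top:top + crop_size, left:left + crop_size]
        if flip:
            x = x[..., ::-1]  # reverse the width axis
        out[name] = np.ascontiguousarray(x)
    return out


def multimodal_mixup(batch, labels, num_classes, alpha, rng):
    """Mixup with one coefficient and one permutation shared by all modalities.

    batch:  dict mapping a modality name to an array of shape (B, C, H, W).
    labels: integer array of shape (B,).
    Returns the mixed batch and soft targets of shape (B, num_classes).
    (Hypothetical helper, not part of MultiMAE.)
    """
    lam = float(rng.beta(alpha, alpha))
    perm = rng.permutation(len(labels))
    one_hot = np.eye(num_classes, dtype=np.float32)[labels]

    mixed = {name: lam * x + (1.0 - lam) * x[perm] for name, x in batch.items()}
    targets = lam * one_hot + (1.0 - lam) * one_hot[perm]
    return mixed, targets
```

Note that linear blending as in mixup only makes sense for continuous modalities such as RGB or depth; interpolating integer class IDs in a semantic segmentation map is not meaningful, so such a modality would need different handling (for example, a cutmix-style paste with a shared box).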

Best, Roman

EPFL-VILAB / MultiMAE

how do i do multimodal classification? #17