This repository provides the code for MMA-DFER, a multimodal (audiovisual) emotion recognition method. It is the official implementation of the paper "MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild".
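The repository does not ship a stand-alone inference script, so the snippet below is only a minimal sketch of how a new video clip could be run through an audiovisual model of this kind. The class name `MMADFER`, the `from_pretrained` loader, the checkpoint path, the `(video, audio)` call signature, the input shapes, and the label list are all assumptions for illustration, not the repository's actual API; please check the model and training code in this repo for the real interfaces.

```python
# Hypothetical sketch of single-clip audiovisual inference.
# NOTE: `MMADFER`, the checkpoint name, the input sizes, and the label order
# are assumptions for illustration only -- they are not this repository's API.
import torch
import torchvision.io as tvio
import torchvision.transforms.functional as TF

from models import MMADFER  # hypothetical import; see the repo for the real class name

LABELS = ["happy", "sad", "neutral", "angry", "surprise", "disgust", "fear"]  # assumed label order


def load_clip(path, num_frames=16, size=224):
    # Decode the frames and the audio track from the same video file.
    video, audio, info = tvio.read_video(path, pts_unit="sec")  # video: (T, H, W, C) uint8
    # Uniformly sample `num_frames` frames and resize them to the assumed input resolution.
    idx = torch.linspace(0, video.shape[0] - 1, num_frames).long()
    frames = video[idx].permute(0, 3, 1, 2).float() / 255.0     # (T, C, H, W) in [0, 1]
    frames = TF.resize(frames, [size, size])
    return frames.unsqueeze(0), audio.unsqueeze(0)              # add a batch dimension


model = MMADFER.from_pretrained("checkpoints/mma_dfer.pth")     # hypothetical loader
model.eval()

frames, audio = load_clip("example.mp4")
with torch.no_grad():
    logits = model(frames, audio)                               # assumed (video, audio) signature
print(LABELS[logits.argmax(dim=-1).item()])
```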