haihuangcode / CMG

The official implementation of Achieving Cross Modal Generalization with Multimodal Unified Representation (NeurIPS '23)

self.audio_semantic_decoder and self.Audio_decoder #2

Open 1090h2400 opened 10 months ago

1090h2400 commented 10 months ago

https://github.com/haihuangcode/CMG/blob/2cbdad8f68d6000657ddf45ace97c855c022334d/code/src/model/main_model_2.py#L507C1-L515C60

Hi! Thanks for your great work! I have some questions I would like to ask you. Is it right to understand it this way: self.audio_semantic_decoder and self.Audio_decoder are used for classification and feature reconstruction, respectively? I also wonder whether this work uses a transformer model, because I noticed a UniEncoder.py file.

Looking forward to hearing from you!

haihuangcode commented 10 months ago

Hello, thank you for your interest in our work.

The code only uses feature reconstruction in the loss function; the classification head is present but does not contribute to the loss. This is primarily because we are doing unsupervised pretraining, so we left that part in place in case we want to extend it later.
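For illustration, here is a minimal sketch (not the repository code) of how two such heads can coexist while only the reconstruction term drives the pretraining objective. Only the attribute names self.audio_semantic_decoder and self.Audio_decoder come from main_model_2.py; the placeholder encoder, layer shapes, and MSE loss are assumptions.

```python
# Minimal sketch, assuming a simple encoder and MSE reconstruction loss.
import torch
import torch.nn as nn

class AudioBranchSketch(nn.Module):
    def __init__(self, feat_dim=128, hidden_dim=256, num_classes=28):
        super().__init__()
        self.encoder = nn.Linear(feat_dim, hidden_dim)  # placeholder encoder (assumption)
        # Classification head: computed but not used in the pretraining loss.
        self.audio_semantic_decoder = nn.Linear(hidden_dim, num_classes)
        # Reconstruction decoder: this is what the pretraining loss is built on.
        self.Audio_decoder = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, feat_dim),
        )

    def forward(self, audio_feat):
        z = self.encoder(audio_feat)
        logits = self.audio_semantic_decoder(z)  # kept for possible later extensions
        recon = self.Audio_decoder(z)            # drives the unsupervised objective
        return logits, recon

model = AudioBranchSketch()
audio_feat = torch.randn(8, 10, 128)  # (batch, time, feature) dummy input
logits, recon = model(audio_feat)
# Only the reconstruction term contributes to the loss; the logits are ignored.
loss = nn.functional.mse_loss(recon, audio_feat)
loss.backward()
```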

We did not actually use a transformer. The UniEncoder code came from an earlier attempt whose results were not very satisfactory, so it was left in the repository but is not actually used.

Not all of the released code is used; some of it consists of previous attempts or discarded code. The core code is in pretrain.py, main_model_2.py, CPC.py, models.py, and CLUB.py.