jperezrua / mfas

Implementation of CVPR 2019 paper "Mfas: Multimodal fusion architecture search"

Extending the work to approaches not utilising pre-trained feature extractors #16

Open abhishektyaagi opened 3 years ago

abhishektyaagi commented 3 years ago

Hi, congratulations on the work; it is really intriguing. I came across this line in the paper:

However, the reader should consider that our fusion approach is in fact not limited to neural networks as primary feature extractors.

I was wondering if you could elaborate on this a little bit.

I would like to use an approach similar to the one in the paper, but without restricting the search to pre-trained feature extractors. If I also want to search over the pre-fusion and post-fusion layers, do you think the current framework can handle that? If so, what would be a good starting point?