jperezrua / mfas

Implementation of CVPR 2019 paper "Mfas: Multimodal fusion architecture search"

MM_IMDB Searchable and AV-MNIST Dataset #9

Closed by Somedaywilldo 4 years ago

Somedaywilldo commented 4 years ago

Dear Author,

Thanks for this work! I'm trying to reproduce the results. First, is AV-MNIST a public dataset? I can't find it, so I'm trying to use mmimdb instead, and I have some questions in addition to #8:

  1. I still have a question about preparing mmimdb: since the image sizes differ, did you crop or pad the images before converting .jpeg to .npy?
  2. I didn't see a searchable class specialized for mmimdb. Does that mean I should just use ModelSearcher() for it?
  3. Also, there seem to be 27 classes in mmimdb, not the 23 reported in the paper: Counter({'Drama': 13967, 'Comedy': 8592, 'Romance': 5364, 'Thriller': 5192, 'Crime': 3838, 'Action': 3550, 'Adventure': 2710, 'Horror': 2703, 'Documentary': 2082, 'Mystery': 2057, 'Sci-Fi': 1991, 'Fantasy': 1933, 'Family': 1668, 'Biography': 1343, 'War': 1335, 'History': 1143, 'Music': 1045, 'Animation': 997, 'Musical': 841, 'Western': 705, 'Sport': 634, 'Short': 471, 'Film-Noir': 338, 'News': 64, 'Adult': 4, 'Talk-Show': 2, 'Reality-TV': 1}). Maybe the last four classes are excluded?
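
A quick sanity check of that guess, as a rough sketch: assuming the 23 genres in the paper are simply the 23 most frequent labels (my assumption, not something I have confirmed), dropping the four rarest labels from the counts above leaves exactly 23. The names `genre_counts`, `kept` and `dropped` are just illustrative.

```python
from collections import Counter

# Genre frequencies copied from the counts listed above.
genre_counts = Counter({
    'Drama': 13967, 'Comedy': 8592, 'Romance': 5364, 'Thriller': 5192,
    'Crime': 3838, 'Action': 3550, 'Adventure': 2710, 'Horror': 2703,
    'Documentary': 2082, 'Mystery': 2057, 'Sci-Fi': 1991, 'Fantasy': 1933,
    'Family': 1668, 'Biography': 1343, 'War': 1335, 'History': 1143,
    'Music': 1045, 'Animation': 997, 'Musical': 841, 'Western': 705,
    'Sport': 634, 'Short': 471, 'Film-Noir': 338, 'News': 64,
    'Adult': 4, 'Talk-Show': 2, 'Reality-TV': 1,
})

# Assumption: the retained genres are the 23 most frequent ones.
kept = [genre for genre, _ in genre_counts.most_common(23)]

# Everything else would be excluded from training and evaluation.
dropped = sorted(set(genre_counts) - set(kept))

print(len(genre_counts), len(kept))  # 27 23
print(dropped)  # ['Adult', 'News', 'Reality-TV', 'Talk-Show']
```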

Sincerely, Somedaywilldo

vielzeuf commented 4 years ago

Hi,

  1. From what I remember, we followed the exact procedure from the GMU paper: "Since all the images do not have the same size, all images were scaled, and cropped when required, to 160×256 pixels keeping the aspect ratio".
  2. I'll leave the answer to Juan Manuel, but I think you need to implement the mmimdb_searchable class.
  3. Again, we followed the original GMU paper pipeline, evaluating the model on the 23 genres provided in their paper (https://arxiv.org/pdf/1702.01992.pdf). So yes, that means excluding the last four classes.
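
For point 1, the "scaled, and cropped when required" step could look roughly like the sketch below. It only computes the geometry and is not the code we actually used. It rests on several assumptions: that 160 is the width and 256 the height (posters are usually portrait), that the image is scaled until it covers the target, and that the crop is centred. The function name `scale_and_crop_box` is hypothetical.

```python
def scale_and_crop_box(width, height, target_w=160, target_h=256):
    """Return (resized_size, crop_box) for an aspect-preserving resize
    followed by a crop to target_w x target_h.

    Assumptions (not confirmed by the GMU paper): target is
    width x height = 160 x 256, and the crop is centred.
    """
    # Scale by the larger ratio so the resized image covers the target
    # in both dimensions while keeping the aspect ratio.
    scale = max(target_w / width, target_h / height)
    new_w = max(target_w, round(width * scale))
    new_h = max(target_h, round(height * scale))

    # Centre the crop window; at least one offset is zero.
    left = (new_w - target_w) // 2
    top = (new_h - target_h) // 2
    crop_box = (left, top, left + target_w, top + target_h)
    return (new_w, new_h), crop_box


# A square 400x400 image is resized to 256x256, then cropped horizontally.
print(scale_and_crop_box(400, 400))  # ((256, 256), (48, 0, 208, 256))
```

With Pillow, the two returned values could be passed to `Image.resize` and `Image.crop` respectively before saving the array as .npy.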

Somedaywilldo commented 4 years ago

That's really helpful, thank you!