facebookresearch / mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
https://mmf.sh/

Pythia model and VizWiz dataset seem to be incompatible #1217

Open kenzheng99 opened 2 years ago

kenzheng99 commented 2 years ago

❓ Questions and Help

I am trying to reproduce the results of running Pythia on the VizWiz dataset, by running the following command:

mmf_run config=projects/pythia/configs/vizwiz/defaults.yaml model=pythia dataset=vizwiz run_type=val

This gives me the following error:

Traceback (most recent call last):
  File "/home/kenzheng/miniconda3/envs/mmf/bin/mmf_run", line 33, in <module>
    sys.exit(load_entry_point('mmf', 'console_scripts', 'mmf_run')())
  File "/home/kenzheng/projects/mmf/mmf_cli/run.py", line 133, in run
    main(configuration, predict=predict)
  File "/home/kenzheng/projects/mmf/mmf_cli/run.py", line 56, in main
    trainer.train()
  File "/home/kenzheng/projects/mmf/mmf/trainers/mmf_trainer.py", line 148, in train
    self.inference()
  File "/home/kenzheng/projects/mmf/mmf/trainers/mmf_trainer.py", line 166, in inference
    report, meter = self.evaluation_loop(dataset, use_tqdm=True)
  File "/home/kenzheng/projects/mmf/mmf/trainers/core/evaluation_loop.py", line 49, in evaluation_loop
    model_output = self.model(prepared_batch)
  File "/home/kenzheng/projects/mmf/mmf/models/base_model.py", line 309, in __call__
    model_output = super().__call__(sample_list, *args, **kwargs)
  File "/home/kenzheng/miniconda3/envs/mmf/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/kenzheng/projects/mmf/mmf/models/pythia.py", line 294, in forward
    "image", sample_list, text_embedding_total
  File "/home/kenzheng/projects/mmf/mmf/models/pythia.py", line 241, in process_feature_embedding
    "to number of features, {}.".format(len(feature_encoders), len(features))
AssertionError: Number of feature encoders, 2 are not equal to number of features, 1.

Other posted issues suggest that this happens because the MMF download for VizWiz is missing one set of image features (the ResNet152 ones?), but I couldn't find a specific answer on how to fix it.

If I'm correct that this is an issue with the MMF download, could it be fixed in a future release? In the meantime, can someone explain whether there is a way to generate the missing features manually?
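
For anyone reading the traceback, here is a minimal sketch of the kind of count check that appears to be failing. It is a simplified stand-in, not the actual code in mmf/models/pythia.py: the function name `check_feature_counts` and the placeholder strings are hypothetical, and only the error message wording is taken from the traceback above. It assumes the model is configured with two image feature encoders while each sample carries a single image feature.

```python
def check_feature_counts(feature_encoders, features):
    """Simplified stand-in for the check in the traceback (not MMF's code).

    Every configured feature encoder needs exactly one matching set of
    features in the batch; otherwise the assertion fails.
    """
    assert len(feature_encoders) == len(features), (
        "Number of feature encoders, {} are not equal "
        "to number of features, {}.".format(len(feature_encoders), len(features))
    )


# Hypothetical placeholders: two encoders configured, one feature loaded.
encoders = ["encoder_a", "encoder_b"]
features = ["feature_a"]

try:
    check_feature_counts(encoders, features)
except AssertionError as err:
    message = str(err)
    print(message)
```

If this reading is right, the mismatch goes away only when the dataset supplies as many feature sets as the model config has encoders, or the config is reduced to the encoders that have features available.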

kyungjunlee commented 2 years ago

Hey, any updates on this? I am encountering the same issue when training Pythia on VizWiz from scratch.