facebookresearch / mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
https://mmf.sh/

Trying to work through the Bootstrap MMF Tutorial #632

Closed floschne closed 4 years ago

floschne commented 4 years ago

❓ Questions and Help

Hi, I'm currently trying to work through your tutorial on bootstrapping-a-multimodal-project-using-mmf.

I downloaded the dataset as described from https://www.drivendata.org/competitions/64/hateful-memes/data/.

But I'm encountering different problems when trying different versions of MMF:

Problem with the MMF package from PyPI

Install procedure (on clean conda python 3.7 env):

pip install --upgrade --pre mmf

Creating the dataset:

mmf_convert_hm --zip_file ./XjiOc5ycDBRRNwbhRlgH.zip --password DontTellYou

This fails because of the checksum check, so I bypass it:

mmf_convert_hm --zip_file ./XjiOc5ycDBRRNwbhRlgH.zip --password DontTellYou --bypass_checksum=1

This leads to the following error:

Data folder is /home/p0w3r/.cache/torch/mmf/data
Zip path is ./XjiOc5ycDBRRNwbhRlgH.zip
Moving ./XjiOc5ycDBRRNwbhRlgH.zip
Unzipping ./XjiOc5ycDBRRNwbhRlgH.zip
Extracting the zip can take time. Sit back and relax.
Traceback (most recent call last):
  File "/home/p0w3r/bin/miniconda3/envs/mt_visualbert/bin/mmf_convert_hm", line 8, in <module>
    sys.exit(main())
  File "/home/p0w3r/bin/miniconda3/envs/mt_visualbert/lib/python3.7/site-packages/mmf_cli/hm_convert.py", line 165, in main
    converter.convert()
  File "/home/p0w3r/bin/miniconda3/envs/mt_visualbert/lib/python3.7/site-packages/mmf_cli/hm_convert.py", line 102, in convert
    self.assert_files(images_path)
  File "/home/p0w3r/bin/miniconda3/envs/mt_visualbert/lib/python3.7/site-packages/mmf_cli/hm_convert.py", line 34, in assert_files
    ), f"{file} doesn't exist in {folder}"
AssertionError: dev.jsonl doesn't exist in /home/p0w3r/.cache/torch/mmf/data/datasets/hateful_memes/defaults/images

When I ignore this and try to visualize some samples as in the tutorial, another error occurs:

from mmf.utils.build import build_dataset

dataset = build_dataset("hateful_memes")
dataset.visualize(num_samples=8)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-1-c157f52453ce> in <module>
      1 from mmf.utils.build import build_dataset
      2 
----> 3 dataset = build_dataset("hateful_memes")
      4 dataset.visualize(num_samples=8)

~/bin/miniconda3/envs/mt_visualbert/lib/python3.7/site-packages/mmf/utils/build.py in build_dataset(dataset_key, config, dataset_type)
    103 
    104     builder_instance: mmf_typings.DatasetBuilderType = dataset_builder()
--> 105     builder_instance.build_dataset(config, dataset_type)
    106     dataset = builder_instance.load_dataset(config, dataset_type)
    107     builder_instance.update_registry_for_model(config)

~/bin/miniconda3/envs/mt_visualbert/lib/python3.7/site-packages/mmf/datasets/base_dataset_builder.py in build_dataset(self, config, dataset_type, *args, **kwargs)
     75         # Only build in main process, so none of the others have to build
     76         if is_master():
---> 77             self.build(config, dataset_type, *args, **kwargs)
     78         synchronize()
     79 

~/bin/miniconda3/envs/mt_visualbert/lib/python3.7/site-packages/mmf/datasets/builders/hateful_memes/builder.py in build(self, config, *args, **kwargs)
     56         # NOTE: This doesn't check for files, but that is a fine assumption for now
     57         assert PathManager.exists(test_path), (
---> 58             "Hateful Memes Dataset doesn't do automatic downloads; please "
     59             + "follow instructions at https://fb.me/hm_prerequisites"
     60         )

AssertionError: Hateful Memes Dataset doesn't do automatic downloads; please follow instructions at https://fb.me/hm_prerequisites

This makes sense to me, since the dataset was not created correctly, as indicated by the checksum error and the first assertion error above.

Problem with the up-to-date version of MMF from the master branch of this repo

Install procedure (on clean conda python 3.7 env in the root directory of the cloned repo): pip install --editable .

Now I can successfully run mmf_convert_hm --zip_file ./XjiOc5ycDBRRNwbhRlgH.zip --password DontTellYou to create the dataset.

Still, executing the code to visualize some samples fails with the following error message:

from mmf.utils.build import build_dataset

dataset = build_dataset("hateful_memes")
dataset.visualize(num_samples=8)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-1-c157f52453ce> in <module>
      1 from mmf.utils.build import build_dataset
      2 
----> 3 dataset = build_dataset("hateful_memes")
      4 dataset.visualize(num_samples=8)

~/gitrepos/mmf/mmf/utils/build.py in build_dataset(dataset_key, config, dataset_type)
    106     dataset_builder = registry.get_builder_class(dataset_key)
    107     assert dataset_builder, (
--> 108         f"Key {dataset_key} doesn't have a registered " + "dataset builder"
    109     )
    110 

AssertionError: Key hateful_memes doesn't have a registered dataset builder

Since I'm very new to MMF, I would really appreciate your help :-)

vedanuj commented 4 years ago

Can you add this before calling build_dataset:

from mmf.utils.env import setup_imports
setup_imports()

floschne commented 4 years ago

Works like a charm :-) Thank you very much! Can you recommend another tutorial where things like your comment get explained? :)

sha9189 commented 4 years ago

Can you recommend another tutorial where things like your comment get explained?

@floschne, you could read the source code to understand what setup_imports() is doing. As shown here, it is doing 2 things:

Hope this helps! :)
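
For anyone hitting the same "doesn't have a registered dataset builder" error: the traceback above shows build_dataset looking the key up via registry.get_builder_class. A plausible reading is that builders are added to the registry as a side effect of importing the modules that define them, so the lookup returns nothing until those imports have run. Below is a minimal, self-contained sketch of that pattern. It is not MMF's actual code: the Registry class, the register_builder decorator, and import_builders are hypothetical stand-ins for illustration only.

```python
# Hypothetical decorator-based registry; illustrative only, not MMF's implementation.
class Registry:
    def __init__(self):
        self._builders = {}

    def register_builder(self, name):
        # Returns a class decorator that stores the class under `name`.
        def decorator(cls):
            self._builders[name] = cls
            return cls
        return decorator

    def get_builder_class(self, name):
        # Returns None when nothing has been registered under `name`.
        return self._builders.get(name)


registry = Registry()


def import_builders():
    """Stand-in for importing the modules that define builders.

    In a real package the class below would live in its own module, and the
    decorator would run only when that module is imported.
    """
    @registry.register_builder("hateful_memes")
    class HatefulMemesBuilder:
        pass

    return HatefulMemesBuilder


# Before the "import" step the key is unknown, which mirrors the assertion failure.
before = registry.get_builder_class("hateful_memes")

# After the "import" step the decorator has run and the lookup succeeds.
import_builders()
after = registry.get_builder_class("hateful_memes")

print(before is None, after.__name__)
```

Under this assumption, calling setup_imports() first plays the role of import_builders() in the sketch, which would explain why the lookup succeeds afterwards.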