facebookresearch / mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
https://mmf.sh/

Training the Image-Region Model #268

Closed. tonytang731 closed this issue 4 years ago.

tonytang731 commented 4 years ago

❓ Questions and Help

Hi,

I am currently trying to train a model with

"mmf_run config= model= dataset=hateful_memes"

and my code for the Image-Region Model is

[image: screenshot of the Image-Region Model code]

And it seems that we need to specify extra stuff

[image: screenshot of the resulting message]

Sorry if I missed anything, and thank you very much for your help!

vedanuj commented 4 years ago

Can you run a clean pip install --upgrade --pre mmf and try it? In my environment this is working.

Also, please list the steps you took before you hit this error.

tonytang731 commented 4 years ago

Thanks for your answer! I am using the second installation method "install from source". I will try the first method and see if it solves my problem.

tonytang731 commented 4 years ago

One more question:

If I am using Google Colab and the first method ("pip install"), do we need to reload the data every time we restart the script? Thank you!

apsdehal commented 4 years ago

If you are disconnected from the runtime for a long time, it will reset and you will have to start again.

tonytang731 commented 4 years ago

Thanks! Now I used the pip install method and loaded the data.

This time I got a different error, related to sort_keys:

[image: screenshot of the sort_keys error]

In addition, it seems that the free version of Google Colab doesn't have enough disk space. Do you have any recommended tools for this competition? Thank you very much! We are currently thinking about upgrading to Google Colab Pro or AWS EC2.

[image]

apsdehal commented 4 years ago

The YAML error is a known issue. In the next release, multiple issues relating to Colab will be fixed, including the prediction issue. For now, update PyYAML by running !pip install --upgrade PyYAML, which will fix this issue.
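A quick way to check whether the installed PyYAML is the likely culprit is to compare its version against 5.1, the release in which yaml.dump gained the sort_keys argument. The sketch below is not part of MMF: the helper names parse_version, supports_sort_keys and installed_pyyaml_version are made up for illustration, and it assumes the error in the screenshot comes from that keyword not being supported.

```python
from importlib import metadata


def parse_version(version):
    """Turn a version string such as '5.3.1' into a tuple of ints, e.g. (5, 3, 1)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def supports_sort_keys(version):
    """PyYAML accepts sort_keys in yaml.dump from 5.1 onwards."""
    return parse_version(version) >= (5, 1)


def installed_pyyaml_version():
    """Return the installed PyYAML version string, or None if it is not installed."""
    try:
        return metadata.version("PyYAML")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_pyyaml_version()
    if found is None:
        print("PyYAML is not installed")
    elif supports_sort_keys(found):
        print(f"PyYAML {found} supports sort_keys")
    else:
        print(f"PyYAML {found} is too old; run: pip install --upgrade PyYAML")
```

On Colab, the runtime may need to be restarted after the upgrade before the new version is picked up.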

For disk space, we are thinking of adding a command-line utility to clean up what MMF caches. For now, you can delete the features tar file that was downloaded by running rm /root/.cache/torch/mmf/data/dataset/hateful_memes/defaults/features/features.tar.gz. That should save a lot of space for you.
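The same cleanup can be done from a notebook cell in Python. This is a minimal sketch, not an MMF utility: find_archives and delete_archives are hypothetical helper names, and the cache root is an assumption based on the path quoted above, so check it against your own runtime before deleting anything. It also assumes the features have already been extracted, so the archive itself is no longer needed.

```python
import os

# Assumed cache root, based on the path quoted in the reply above.
MMF_CACHE_ROOT = "/root/.cache/torch/mmf/data"


def find_archives(root, suffix=".tar.gz"):
    """Return (path, size_in_bytes) for every archive under root, largest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                path = os.path.join(dirpath, name)
                found.append((path, os.path.getsize(path)))
    return sorted(found, key=lambda item: item[1], reverse=True)


def delete_archives(root, suffix=".tar.gz", dry_run=True):
    """Delete archives under root and return the number of bytes freed.

    With dry_run=True nothing is removed; the return value is what would be freed.
    """
    freed = 0
    for path, size in find_archives(root, suffix):
        if not dry_run:
            os.remove(path)
        freed += size
    return freed


if __name__ == "__main__":
    for path, size in find_archives(MMF_CACHE_ROOT):
        print(f"{size / 1e6:.1f} MB  {path}")
    # Switch dry_run to False only after checking the list printed above.
    print("Would free", delete_archives(MMF_CACHE_ROOT, dry_run=True), "bytes")
```

The dry run is the default so that nothing is deleted until the list of archives has been reviewed.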

Once you start training, models will be checkpointed, which can again take up space. You can increase the checkpoint interval by adding training.checkpoint_interval=5000 (or any other number) at the end of your command.
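To illustrate where the override goes, the sketch below assembles the command line as a string. build_mmf_command is a hypothetical helper, not part of MMF, and CONFIG_PATH and MODEL_NAME are placeholders because the thread leaves the config and model values blank. Only dataset=hateful_memes and training.checkpoint_interval come from the discussion above.

```python
import shlex


def build_mmf_command(config, model, dataset, overrides=None):
    """Build an mmf_run command with key=value overrides appended at the end."""
    args = ["mmf_run", f"config={config}", f"model={model}", f"dataset={dataset}"]
    for key, value in (overrides or {}).items():
        args.append(f"{key}={value}")
    # shlex.join quotes any argument that contains spaces or shell metacharacters.
    return shlex.join(args)


if __name__ == "__main__":
    # CONFIG_PATH and MODEL_NAME are placeholders; substitute your own values.
    print(build_mmf_command(
        config="CONFIG_PATH",
        model="MODEL_NAME",
        dataset="hateful_memes",
        overrides={"training.checkpoint_interval": 5000},
    ))
```

A larger interval means fewer checkpoints on disk, at the cost of losing more progress if the Colab runtime resets between saves.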

tonytang731 commented 4 years ago

Thank you! It's very helpful.