Colab example returns 0 with confidence score of 0.999462902545929 for query "look how many people love you"

🐛 Bug

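I'm trying out the example notebook https://colab.research.google.com/github/facebookresearch/mmf/blob/notebooks/notebooks/mmf_hm_example.ipynb#scrollTo=ZKzyiRYuUMYj with the pretrained model. For the query "look how many people love you", classify returns label 0 with a confidence score of 0.999462902545929.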

Full log: https://gist.github.com/jiaodong/a4db1b4ab209f7cb8d92be5ef485bff6

To Reproduce

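class ModelTwo:
    def __init__(self, checkpoint_path) -> None:
        # Imported here so mmf is only needed once the class is instantiated.
        from mmf.models.mmbt import MMBT

        # checkpoint_path is unused; the pretrained model is loaded by name instead.
        self.model = MMBT.from_pretrained("mmbt.hateful_memes.images")

    def forward(self, image_payload_bytes, query):
        # Both arguments are ignored; the inputs are hardcoded to the notebook's example.
        image_url = "https://i.imgur.com/tEcsk5q.jpg"  #@param {type:"string"}
        text = "look how many people love you"  #@param {type: "string"}
        print(self.model.classify(image_url, text))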
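
For reading the printed result, here is a minimal sketch of how the output could be interpreted. It assumes that classify returns a dict with "label" and "confidence" keys and that label 1 means hateful while label 0 means not hateful, which is how the linked notebook appears to treat it. describe_prediction is a hypothetical helper for illustration and is not part of MMF.

```python
def describe_prediction(output: dict) -> str:
    """Turn an assumed {"label": int, "confidence": float} dict into text."""
    # Assumption: label 1 = hateful, label 0 = not hateful.
    verdict = "hateful" if output["label"] == 1 else "not hateful"
    return f"{verdict} ({output['confidence'] * 100:.2f}% confidence)"


# Values taken from the title of this issue.
reported = {"label": 0, "confidence": 0.999462902545929}
print(describe_prediction(reported))
```

If those assumptions hold, the reported output would read as "not hateful" with high confidence rather than as a failed call.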

Expected behavior

Environment

Please copy and paste the output from the environment collection script from PyTorch (or fill out the checklist below manually).

You can run the script with:

# For security purposes, please check the contents of collect_env.py before running it.
python -m torch.utils.collect_env

PyTorch Version (e.g., 1.0):
OS (e.g., Linux):
How you installed PyTorch (conda, pip, source):
Build command you used (if compiling from source):
Python version:
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:
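
If running the collection script is not an option, part of the checklist can be filled in by hand. Below is a minimal sketch that uses only the Python standard library. collect_basic_env is a hypothetical helper, not something provided by PyTorch or MMF, and it does not report CUDA/cuDNN or GPU details, which the PyTorch script covers.

```python
import platform
import sys
from importlib import metadata


def collect_basic_env() -> dict:
    """Gather a few of the checklist fields using only the standard library."""
    try:
        # Reads the installed package metadata without importing torch.
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        torch_version = "not installed"
    return {
        "PyTorch Version": torch_version,
        "OS": platform.platform(),
        "Python version": sys.version.split()[0],
    }


for field, value in collect_basic_env().items():
    print(f"{field}: {value}")
```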

facebookresearch/mmf, issue #1226
