facebookresearch / mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
https://mmf.sh/

Colab example returns 0 with confidence score of 0.999462902545929 for query "look how many people love you" #1226

Closed: jiaodong closed this issue 2 years ago

jiaodong commented 2 years ago

🐛 Bug

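I'm trying out the example https://colab.research.google.com/github/facebookresearch/mmf/blob/notebooks/notebooks/mmf_hm_example.ipynb#scrollTo=ZKzyiRYuUMYj with the pretrained model. For the query "look how many people love you", classify returns 0 with a confidence score of 0.999462902545929.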

Full log: https://gist.github.com/jiaodong/a4db1b4ab209f7cb8d92be5ef485bff6

To Reproduce

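class ModelTwo:
    def __init__(self, checkpoint_path) -> None:
        # Imported here so mmf is only loaded when the model is constructed.
        from mmf.models.mmbt import MMBT

        # checkpoint_path is unused; the pretrained model is loaded by its name instead.
        self.model = MMBT.from_pretrained("mmbt.hateful_memes.images")

    def forward(self, image_payload_bytes, query):
        # The arguments are ignored; the inputs from the Colab example are hardcoded.
        image_url = "https://i.imgur.com/tEcsk5q.jpg" #@param {type:"string"}
        text = "look how many people love you" #@param {type: "string"}
        print(self.model.classify(image_url, text))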
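
For reading the result: as far as I understand, classify returns a label together with a confidence, and in the Colab notebook label 1 is treated as hateful and label 0 as not hateful. The sketch below is not MMF code. It only illustrates, under those assumptions, how a softmax over two class scores followed by an argmax can give label 0 with a confidence close to 1. The helper names (softmax, to_prediction) and the logits are hypothetical and chosen for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_prediction(logits):
    """Pick the most probable class and report its probability as confidence."""
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return {"label": label, "confidence": probs[label]}


# Hypothetical logits, not taken from the actual model output.
prediction = to_prediction([4.0, -3.5])

# Assumed convention from the Colab example: label 1 means hateful.
hateful = "Yes" if prediction["label"] == 1 else "No"
print(prediction, "hateful:", hateful)
```

Under this reading, a result of 0 with high confidence would mean the model is confident the meme is not hateful, not that the call failed.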

Expected behavior

Environment

Please copy and paste the output from the environment collection script from PyTorch (or fill out the checklist below manually).

You can run the script with:

# For security purposes, please check the contents of collect_env.py before running it.
python -m torch.utils.collect_env
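
If it is easier to capture the environment report from inside a notebook or script, the same module can be invoked through a subprocess. This is a minimal sketch, assuming PyTorch is installed in the current interpreter; the helper names collect_env_command and run_collect_env are hypothetical and not part of PyTorch or MMF.

```python
import subprocess
import sys


def collect_env_command():
    """Build the command that runs PyTorch's environment collection module."""
    return [sys.executable, "-m", "torch.utils.collect_env"]


def run_collect_env():
    """Run the command and return its output, or the error text on failure."""
    result = subprocess.run(
        collect_env_command(),
        capture_output=True,
        text=True,
        check=False,  # do not raise; report the failure as text instead
    )
    if result.returncode != 0:
        return "collect_env failed:\n" + result.stderr
    return result.stdout


if __name__ == "__main__":
    # Paste the printed report into the Environment section of the issue.
    print(run_collect_env())
```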

Additional context