facebookresearch / mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
https://mmf.sh/

All negative predictions from Visual BERT pretrained on COCO, for hateful memes #291

Closed · andrewlee98 closed 4 years ago

andrewlee98 commented 4 years ago

šŸ› Bug

When I run forward() on the Hateful Memes validation set, every prediction comes out as the negative class. Here are the first few results:

```python
{'scores': tensor([[ 3.3944, -3.3940],
        [ 4.1372, -4.5100],
        [ 4.0722, -4.3152],
        [ 3.7807, -4.3434],
        [ 4.1213, -4.4178],
        [ 0.6177, -0.1645],
        [ 3.9144, -4.2022],
        [ 3.7445, -4.4305],
        [ 3.9649, -4.4931],
        [ 4.0661, -4.4082],
        [ 4.0079, -4.7269],
        [ 2.8871, -2.6498],
        [ 3.8696, -4.3932],
        [ 3.7421, -4.0838],
        [ 3.7046, -4.1818],
        [ 3.7688, -4.0062],
        [ 3.9317, -4.4949],
        [ 3.6897, -4.1206],
        [ 3.6248, -3.9990],
        [ 4.0347, -4.5704],
        [ 4.0118, -4.5896],
        [ 3.3246, -3.0516],
        ...
```

and the rest of the results are more or less the same.

Command

I loaded the pretrained model:

```python
from mmf.common.registry import registry

# Look up the VisualBERT model class and load the zoo checkpoint
# fine-tuned on Hateful Memes (from COCO pretraining).
model_cls = registry.get_model_class("visual_bert")
model = model_cls.from_pretrained("visual_bert.finetuned.hateful_memes.from_coco")
```

... tokenization code to get input_ids ...

```python
import numpy as np
import torch

from mmf.common.sample import Sample, SampleList

input_mask = torch.tensor(np.ones(MAX_SEQ_LEN)).long()
segment_ids = torch.tensor(np.zeros(MAX_SEQ_LEN)).long()

# Build one Sample per validation example, then batch them into a SampleList.
samples = [Sample({'input_ids': torch.tensor(x).long(),
                   'input_mask': input_mask,
                   'segment_ids': segment_ids}) for x in x_val]
sample_list = SampleList(samples)

preds = model(sample_list)  # calling the module runs forward()
```

I tokenized the text using the bert-for-tf2 library:

```python
from bert.tokenization.bert_tokenization import FullTokenizer

tokenizer = FullTokenizer(vocab_file='pretrained_bert_model/vocab.txt')
```

where the vocab file is downloaded from the BERT repo: https://github.com/google-research/bert
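For comparison, the same tokenization with the Hugging Face transformers tokenizer would look roughly like this (a sketch, assuming a recent transformers version and the `bert-base-uncased` vocabulary; `MAX_SEQ_LEN` is the same constant as above):

```python
import torch
from transformers import BertTokenizer

# bert-base-uncased matches the standard BERT vocab used by VisualBERT.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Tokenize one meme caption into the three tensors the model expects.
encoded = tokenizer.encode_plus(
    "some meme text",
    max_length=MAX_SEQ_LEN,
    padding="max_length",
    truncation=True,
)
input_ids = torch.tensor(encoded["input_ids"]).long()
input_mask = torch.tensor(encoded["attention_mask"]).long()
segment_ids = torch.tensor(encoded["token_type_ids"]).long()
```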

Expected behavior

I expected the predictions to give a good AUROC (around 0.73, as reported in the paper), but it came out to 0.422, and every positive-class probability was close to 0 after softmax.
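For reference, this is roughly how I computed the AUROC (a sketch; `y_val` is my own array of 0/1 labels, not an MMF name):

```python
import torch
from sklearn.metrics import roc_auc_score

# Column 1 of the logits is the positive ("hateful") class;
# softmax turns each logit pair into class probabilities.
probs = torch.softmax(preds["scores"], dim=1)[:, 1]
print(roc_auc_score(y_val, probs.detach().numpy()))  # 0.422 here vs. ~0.73 in the paper
```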

Environment

```
PyTorch version: 1.5.0
Is debug build: No
CUDA used to build PyTorch: 10.2

OS: Ubuntu 16.04.6 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
CMake version: version 3.5.1

Python version: 3.6
Is CUDA available: No
CUDA runtime version: 10.0.130
GPU models and configuration: GPU 0: Tesla K40c
Nvidia driver version: 418.87.00
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.3

Versions of relevant libraries:
[pip3] numpy==1.16.6
[pip3] torch==1.5.0
[pip3] torchtext==0.5.0
[pip3] torchvision==0.6.0
[conda] Could not collect
```

apsdehal commented 4 years ago

Any particular reason you are not using mmf_predict to run predictions?
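For the checkpoint you loaded, the invocation would look roughly like this (a sketch based on the Hateful Memes docs; adjust the config path for your setup):

```bash
mmf_predict config=projects/hateful_memes/configs/visual_bert/from_coco.yaml \
    model=visual_bert \
    dataset=hateful_memes \
    run_type=test \
    checkpoint.resume_zoo=visual_bert.finetuned.hateful_memes.from_coco
```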

andrewlee98 commented 4 years ago

I would like to run the model on a different dataset, and to have the flexibility to extend the model.

apsdehal commented 4 years ago

I would first run the actual mmf_run command for this pretrained model on the validation set, to see if you can match the validation accuracy reported in the paper for the pretrained model. If you can't, we will look into what your issue might be.
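Something along these lines (a sketch; same config and zoo key as in the predict command above, with run_type=val for validation):

```bash
mmf_run config=projects/hateful_memes/configs/visual_bert/from_coco.yaml \
    model=visual_bert \
    dataset=hateful_memes \
    run_type=val \
    checkpoint.resume_zoo=visual_bert.finetuned.hateful_memes.from_coco
```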

Btw, your CUDA isn't set up properly: `Is CUDA available` says No for you. You probably need to install the proper PyTorch 1.5 build for CUDA 10.1.

gchhablani commented 4 years ago

@andrewlee98 How are you passing the visual_embedding to the VisualBERT model?
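If you are only passing input_ids/input_mask/segment_ids, the model never sees the image. Roughly, each sample also needs the region features, something like this (a sketch, assuming precomputed Faster R-CNN features; the `feats` variable is illustrative):

```python
import torch
from mmf.common.sample import Sample

sample = Sample()
sample.input_ids = input_ids        # text inputs as before
sample.input_mask = input_mask
sample.segment_ids = segment_ids
# Region features extracted offline, shape (num_regions, 2048).
sample.image_feature_0 = torch.tensor(feats).float()
```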

apsdehal commented 4 years ago

Closing as there was no response. Please open a new issue if the problem persists.