Issue in performing inference only on V model

preet021 commented 3 years ago

I was trying to run inference with the VisualBERT model when I hit the following error while running the bash file bash/inference/V/hm_V45.sh (it's the same error with the other two seeds as well):

  File "hm.py", line 393, in <module>
    main()
  File "hm.py", line 374, in main
    dump=os.path.join(args.output, '{}_{}.csv'.format(args.exp, split))
  File "hm.py", line 269, in predict
    for i, datum_tuple in enumerate(loader):
  File "/home2/preet.thakkar/anaconda3/envs/i3_vl/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 363, in __next__
    data = self._next_data()
  File "/home2/preet.thakkar/anaconda3/envs/i3_vl/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 989, in _next_data
    return self._process_data(data)
  File "/home2/preet.thakkar/anaconda3/envs/i3_vl/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1014, in _process_data
    data.reraise()
  File "/home2/preet.thakkar/anaconda3/envs/i3_vl/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home2/preet.thakkar/anaconda3/envs/i3_vl/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 185, in _worker_loop
    data = fetcher.fetch(index)
  File "/home2/preet.thakkar/anaconda3/envs/i3_vl/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home2/preet.thakkar/anaconda3/envs/i3_vl/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/scratch/preet.thakkar/vilio/fts_lmdb/hm_data.py", line 79, in __getitem__
    img = self.process_img(iid)
  File "/scratch/preet.thakkar/vilio/fts_lmdb/hm_data.py", line 59, in process_img
    f = self.id2file[iid]
KeyError: 15740

This error occurs when the predict function in hm.py is called for test_unseen. The KeyError is probably raised because the features extracted (in vilio/data/features/) with the command bash ./bash/inference/V/hm_VLMDB.sh do not include the 2K files of test_unseen.
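One way to confirm this before starting inference is to compare the ids a split asks for against the ids the feature store actually contains. The snippet below is only a minimal sketch: the helper name find_missing_ids is hypothetical, the split file is assumed to be JSON lines with an integer "id" field, and the available ids are assumed to come from something like the keys of the id2file mapping seen in the traceback. The sample data is made up for illustration.

```python
import json


def find_missing_ids(jsonl_lines, available_ids):
    """Return the example ids in a split that have no extracted features.

    jsonl_lines: iterable of JSON strings, one example per line, each with
        an integer "id" field (assumed layout of the split file).
    available_ids: iterable of ids for which features exist, for example
        the keys of the id2file mapping (an assumption, not checked here).
    """
    available = {int(i) for i in available_ids}
    missing = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        example_id = int(json.loads(line)["id"])
        if example_id not in available:
            missing.append(example_id)
    return missing


# Toy data: the feature store covers two ids, the split asks for three.
split_lines = [
    '{"id": 42953, "img": "img/42953.png"}',
    '{"id": 15740, "img": "img/15740.png"}',
    '{"id": 8291, "img": "img/08291.png"}',
]
feature_ids = [42953, 8291]
print(find_missing_ids(split_lines, feature_ids))  # [15740]
```

If a check like this returns a non-empty list for test_unseen but an empty one for train, dev and test_seen, the feature files themselves are incomplete rather than the loading code being wrong.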

Muennighoff commented 3 years ago

Yeah, it looks like it didn't get the lmdb file that includes the test_unseen features, only the one for train, dev and test_seen.

Did you download them from here: https://dl.fbaipublicfiles.com/mmf/data/datasets/hateful_memes/defaults/features/features.tar.gz ?

preet021 commented 3 years ago

Yes, it's the same link that is given in SCORE_REPRO.md.

Muennighoff commented 3 years ago

Can you try with these lmdb files & check if it works: https://www.kaggle.com/muennighoff/hmfeatureszipfin

preet021 commented 3 years ago

Okay, I will try the new link. Also, will inference-only produce the same results as training? I.e., is the inference done on baseline models?

Muennighoff commented 3 years ago

Yes, it should produce the same results within ±2% absolute, since ERNIE-ViL is a bit variable.

preet021 commented 3 years ago

Downloading the features from the Kaggle link worked. Thanks.

Muennighoff commented 3 years ago

Great, thanks for finding this. It seems that FB has changed its link back to the Phase I features. I will update the link in the README to use those. Closing this for now; feel free to reopen if needed.

Muennighoff / vilio

Issue in performing inference only on V model #3