Closed sTranaeus closed 4 years ago
Automated download is not enabled for COCO datasets. We will add support for it.
Thank you for the response! It's not clear to me if a change has been pushed or not. When rerunning the same command, on the latest version of mmf from source, I have the same issue. Should I be expecting a different behaviour?
For what it's worth, there is some automatic downloading already happening, and I can see a lot of COCO data present:
$ ls .cache/torch/mmf/data/datasets/coco/defaults/features/ -lh
total 161G
drwxrwxr-x 2 k1762177 k1762177 4 May 18 23:24 test2015.lmdb
-rw-rw-r-- 1 k1762177 k1762177 64G Jul 14 18:19 test2015.tar.gz
drwxr-xr-x 2 k1762177 k1762177 4 May 18 07:39 trainval2014.lmdb
-rw-rw-r-- 1 k1762177 k1762177 97G Jul 14 21:19 trainval2014.tar.gz
The specific file that is missing according to MMF is (see the rest of the error output in the issue's first comment):
FileNotFoundError: [Errno 2] No such file or directory: '/home/k1762177/.cache/torch/mmf/data/datasets/coco/defaults/annotations/imdb_karpathy_train.npy'
Is this file also enabled for automatic download?
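For anyone hitting the same error, a quick way to see which expected files are absent is a small shell check like the one below. This is only a sketch: `report_missing` is a hypothetical helper written for this thread, not an MMF command, and the only file name it is called with here is the one from the error message above.

```shell
#!/usr/bin/env bash
# report_missing is a hypothetical helper, not part of MMF.
# Usage: report_missing DIR NAME [NAME...]
# Prints one line per missing file and returns 1 if any are missing.
report_missing() {
    local dir="$1"
    shift
    local name
    local missing=0
    for name in "$@"; do
        if [ ! -e "$dir/$name" ]; then
            echo "missing: $dir/$name"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `report_missing "$HOME/.cache/torch/mmf/data/datasets/coco/defaults/annotations" imdb_karpathy_train.npy` prints the path if that file has not been extracted yet.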
@sTranaeus Yes, it should be there, here is the direct link to download: https://dl.fbaipublicfiles.com/mmf/data/datasets/coco/defaults/annotations/annotations.tar.gz in case you still haven't found it.
rm -rf /home/k1762177/.cache/torch/mmf/data/datasets/coco/defaults/annotations/
cd /home/k1762177/.cache/torch/mmf/data/datasets/coco/defaults/
mkdir annotations
cd annotations
wget https://dl.fbaipublicfiles.com/mmf/data/datasets/coco/defaults/annotations/annotations.tar.gz
Then run your normal mmf command (don't extract the annotations manually).
Thank you. I've gone through those commands, and am getting a checksum error now:
AssertionError: [ Checksum for annotations.tar.gz from https://dl.fbaipublicfiles.com/mmf/data/datasets/coco/defaults/annotations/annotations.tar.gz does not match the expected checksum. Please try again. ]
Should I just keep trying again? Or is there an issue with the annotations file stored at that link?
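One way to tell a bad download from a bad expected checksum is to inspect the archive directly. The sketch below is an illustration, not MMF's own verification: `check_archive` is a hypothetical helper, and the use of SHA-256 is an assumption about which digest is worth comparing. `gzip -t` reads the whole file and fails if it is truncated or corrupt, which is a common cause of checksum mismatches on large downloads.

```shell
#!/usr/bin/env bash
# check_archive is a hypothetical helper, not part of MMF.
# It fails if the archive is missing, empty, or not a valid gzip stream;
# otherwise it prints the SHA-256 digest (assumed digest type) of the file.
check_archive() {
    local archive="$1"
    if [ ! -s "$archive" ]; then
        echo "missing or empty: $archive" >&2
        return 1
    fi
    # gzip -t decompresses to nowhere and reports corruption or truncation.
    if ! gzip -t "$archive" 2>/dev/null; then
        echo "corrupt or truncated: $archive" >&2
        return 1
    fi
    # Print only the hex digest so it can be compared across downloads.
    sha256sum "$archive" | cut -d' ' -f1
}
```

Running `check_archive annotations.tar.gz` after two separate downloads and comparing the printed digests shows whether the file itself changes between attempts.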
Let me try a fresh install and get back to you.
@apsdehal still getting the error - any luck here?
@sTranaeus I did a fresh clone, ran MMF in isolation, and confirmed that this works: the only error it throws is at the optimizer-loading stage, since no optimizer is defined.
conda create -n mmf_test python=3.7
conda activate mmf_test
cd ~
mkdir -p test
cd test
git clone https://github.com/facebookresearch/mmf.git
cd mmf
python setup.py develop
mmf_run config=configs/datasets/coco/defaults.yaml model=visual_bert dataset=coco
This downloaded all of the features and annotations, extracted them, and everything worked fine.
Also, if you are actually running VisualBERT for pretraining on COCO, you want dataset=masked_coco and the proper project configuration in projects/visual_bert/configs/.
Closing, since we haven't heard back and we tested that it is working as expected. Please open a new issue if the problem persists.
❓ Questions and Help
I want to pretrain VisualBERT on COCO again, and tried running
!mmf_run config=configs/datasets/coco/defaults.yaml model=visual_bert dataset=coco
but got a FileNotFoundError (shown below). I thought it might be a mistake on my side in how I set things up, but isn't mmf_run meant to automatically handle setting up the data locally if it isn't there?