microsoft / Oscar

Oscar and VinVL
MIT License

Getting "No such file or directory: 'datasets/coco_caption/coco-train-words.p'" when training model with SCST loss optimization #87

gsrivas4 closed 3 years ago

gsrivas4 commented 3 years ago

I am training the latest Oscar code with SCST optimization, following the instructions in this section - https://github.com/microsoft/Oscar/blob/master/MODEL_ZOO.md#image-captioning-on-coco. However, I am getting the error below:

Traceback (most recent call last):
  File "oscar/run_captioning.py", line 1009, in <module>
    main()
  File "oscar/run_captioning.py", line 984, in main
    last_checkpoint = train(args, train_dataloader, val_dataloader, model, tokenizer)
  File "oscar/run_captioning.py", line 455, in train
    baseline_type=args.sc_baseline_type,
  File "/home/default/ephemeral_drive/work/image_captioning/Oscar_13april/oscar/utils/caption_evaluate.py", line 119, in __init__
    self.CiderD_scorer = CiderD(df=cider_cached_tokens)
  File "/home/default/ephemeral_drive/work/image_captioning/Oscar_13april/oscar/utils/cider/pyciderevalcap/ciderD/ciderD.py", line 28, in __init__
    self.cider_scorer = CiderScorer(n=self._n, df_mode=self._df)
  File "/home/default/ephemeral_drive/work/image_captioning/Oscar_13april/oscar/utils/cider/pyciderevalcap/ciderD/ciderD_scorer.py", line 80, in __init__
    pkl_file = cPickle.load(open(df_mode,'rb'), **(dict(encoding='latin1') if six.PY3 else {}))
FileNotFoundError: [Errno 2] No such file or directory: 'datasets/coco_caption/coco-train-words.p'

I looked through the files I am supposed to download and did not find coco-train-words.p in any of the recommended downloads listed in this section - https://github.com/microsoft/Oscar/blob/master/DOWNLOAD.md#datasets. I downloaded the coco_caption dataset from this location - https://biglmdiag.blob.core.windows.net/oscar/datasets/coco_caption.zip. However, this coco_caption download does not contain the file coco-train-words.p. Also, I had been using an older version of the Oscar code and did not have this issue with that version - https://github.com/microsoft/Oscar/tree/9b07b6735267c0eb30fc2efd482d4295d13c7b4f. Please let me know how I can download the coco-train-words.p file, or if anyone has guidance on how to resolve this issue.
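For anyone hitting the same error, a small pre-flight check can confirm whether the file is really absent before a long training run starts. This is only a sketch: the directory and file name are taken from the traceback above, while `find_missing_files`, `DATA_DIR` and `REQUIRED_FILES` are hypothetical names introduced here and are not part of the Oscar code.

```python
from pathlib import Path

# Path and file name come from the FileNotFoundError in the traceback.
# The helper and constant names below are illustrative, not Oscar APIs.
DATA_DIR = Path("datasets/coco_caption")
REQUIRED_FILES = ["coco-train-words.p"]


def find_missing_files(data_dir, required_files):
    """Return the names in required_files that are not regular files under data_dir."""
    data_dir = Path(data_dir)
    return [name for name in required_files if not (data_dir / name).is_file()]


if __name__ == "__main__":
    missing = find_missing_files(DATA_DIR, REQUIRED_FILES)
    if missing:
        print(f"Missing from {DATA_DIR}: {', '.join(missing)}")
    else:
        print(f"All required files found in {DATA_DIR}")
```

Run it from the same working directory used to launch oscar/run_captioning.py, since the path in the error message is relative.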

gsrivas4 commented 3 years ago

I found the coco-train-words.p file by downloading coco_caption from this link - https://github.com/microsoft/Oscar/blob/master/VinVL_DOWNLOAD.md#datasets. However, the new download did not include the train.feature.tsv file, so I copied it from the download at this location - https://github.com/microsoft/Oscar/blob/master/DOWNLOAD.md#datasets.
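The workaround above amounts to filling gaps in the newer coco_caption folder with files from the older one. A minimal sketch of that copy step follows, assuming both archives have already been extracted locally. `copy_missing_files` and the directory names in the usage comment are hypothetical and chosen only for illustration.

```python
import shutil
from pathlib import Path


def copy_missing_files(old_dir, new_dir, names):
    """Copy each named file from old_dir into new_dir unless it already exists there.

    Returns the list of names that were actually copied.
    """
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    copied = []
    for name in names:
        src, dst = old_dir / name, new_dir / name
        if dst.exists():
            continue  # keep whatever the newer download already provides
        if not src.is_file():
            raise FileNotFoundError(f"{src} not found in the older download")
        shutil.copy2(src, dst)
        copied.append(name)
    return copied


# Example call (both directory names are assumptions about the local layout):
# copy_missing_files("coco_caption_old", "datasets/coco_caption", ["train.feature.tsv"])
```

The function skips files that already exist in the destination, so nothing from the newer download is overwritten.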