yao-jz closed this issue 1 year ago
Could you please double-check whether the files are here? I am away from my workstation at the moment, but I believe I ran all training successfully with the following package. Packed pre-extracted data for both OK-VQA and F-VQA (including OCR features, VinVL object detection features, and Oscar captioning features): Google Drive
I didn't use the pre-extracted data you provided.
I have generated the OCR features, VinVL object detection features, and the captions with your code.
I see. The missing files are annotations from the GS author. Could you please download the file and extract the missing files from it? I packed everything into this file at release.
I tried several times on different networks but still failed to download the file. I think it is too large to download from Google Drive.
I downloaded the retriever_train/testdev/test.json files here. However, I also found that another file is missing: okvqa_full_corpus_title.csv.
Will Baiduyun work for you? If so, I can upload a copy there tomorrow.
okvqa_full_corpus_title.csv adds a dummy "title" column to okvqa_full_corpus.csv so that it can be processed by the script that generates the index file. It is also in the packed file.
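If the packed file cannot be downloaded, the dummy column described above could in principle be regenerated locally. The following is only a minimal sketch under assumptions not confirmed in this thread: the helper name add_dummy_title is hypothetical, the column is appended last, and it is filled with an empty string. The real okvqa_full_corpus_title.csv may use a different column order or placeholder value.

```python
import csv
import io


def add_dummy_title(src, dst, title=""):
    """Copy a CSV from src to dst, appending a constant 'title' column.

    Hypothetical helper: the column position and the empty placeholder
    are assumptions, not the repository's documented format.
    """
    reader = csv.DictReader(src)
    fieldnames = list(reader.fieldnames) + ["title"]
    writer = csv.DictWriter(dst, fieldnames=fieldnames)
    writer.writeheader()
    count = 0
    for row in reader:
        row["title"] = title  # same dummy value for every passage
        writer.writerow(row)
        count += 1
    return count


if __name__ == "__main__":
    # In-memory demo; for real files, open both with newline="".
    demo_in = io.StringIO('kid,text\n0,"first, passage"\n1,second passage\n')
    demo_out = io.StringIO()
    add_dummy_title(demo_in, demo_out)
    print(demo_out.getvalue())
```

For real files, both okvqa_full_corpus.csv and the output would be opened with newline="" so that the csv module handles quoting and line endings itself.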
I have already downloaded the missing retriever_* files from another repo. Thank you very much.
There is a "kid" column in okvqa_full_corpus.csv (as shown below). Is that the dummy column?
kid,text
0,text
1,text
...
Here are the first two rows of that file:
kid,text
passage,"about the doberman pinscher dobermans are compactly-built dogs—muscular, fast, and powerful—standing between 24 to 28 inches at the shoulder. dobermans are compactly-built dogs—muscular, fast, and powerful—standing between 24 to 28 inches at the shoulder."
passage,history: a german named louis dobermann is credited with developing the doberman pinscher breed in the late 1800s. he was a tax collector and wanted a fierce guard dog to accompany him on his rounds.
But these are the first two rows of the okvqa_full_corpus.csv I downloaded:
kid,text
0,"about the doberman pinscher dobermans are compactly-built dogs—muscular, fast, and powerful—standing between 24 to 28 inches at the shoulder. dobermans are compactly-built dogs—muscular, fast, and powerful—standing between 24 to 28 inches at the shoulder."
1,history: a german named louis dobermann is credited with developing the doberman pinscher breed in the late 1800s. he was a tax collector and wanted a fierce guard dog to accompany him on his rounds.
Link: https://pan.baidu.com/s/17CJ_yWdsDX3Agz4nTnYX4g Password: xj3e
I think I refactored the kid column for some reason. You can just keep the values as they are and modify the processing script that generates the FAISS index. I believe you won't have difficulty running the DPR training, since it doesn't require this _title.csv file. That file is only used when creating the FAISS index.
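One possible way to make an indexing script tolerate either version of the kid column is to ignore its contents and use the row position as the passage ID. This is only a sketch of that idea: load_passages is a hypothetical helper, not a function from this repo, and it assumes the index generation step only needs a stable integer ID and the passage text.

```python
import csv
import io


def load_passages(src):
    """Return (row_index, text) pairs, ignoring whatever 'kid' contains.

    Hypothetical helper: assumes a header row with a 'text' column and
    that the row position is an acceptable passage ID.
    """
    reader = csv.DictReader(src)
    return [(i, row["text"]) for i, row in enumerate(reader)]


if __name__ == "__main__":
    # Both layouts seen in this thread give the same IDs.
    numeric = io.StringIO("kid,text\n0,first passage\n1,second passage\n")
    literal = io.StringIO("kid,text\npassage,first passage\npassage,second passage\n")
    print(load_passages(numeric) == load_passages(literal))
```

With this approach the two files compared above would be treated identically, as long as their rows are in the same order.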
For your convenience, I shared the passage files above.
Thanks!
Hi, when I tried to train the DPR model, I couldn't find the dpr_training_annotations files in the repo.
The config in the jsonnet is
I didn't find the three json files.
Thanks!