LinWeizheDragon / FLMR

The Hugging Face implementation of the Fine-grained Late-interaction Multi-modal Retriever.

Wikipedia Corpus used for OK-VQA task #31

Open ShuaiWang97 opened 3 days ago

ShuaiWang97 commented 3 days ago

Thanks for the great work! In the FLMR paper, I saw that two external datasets are used for the OK-VQA task: the Google Search Corpus and the Wikipedia Corpus. I found the Google Search Corpus here, but not the Wikipedia Corpus. I am quite interested in the latter, so I would like to ask: can I find that data somewhere? Thanks in advance!

Best, Shuai

LinWeizheDragon commented 3 days ago

https://huggingface.co/datasets/BByrneLab/multi_task_multi_modal_knowledge_retrieval_benchmark_M2KR

It is released together with the M2KR benchmark. Check out the "OKVQA_data" and "OKVQA_passages" subsets.
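If it helps, loading those subsets could look roughly like the sketch below. The subset names `OKVQA_data` and `OKVQA_passages` are taken from the comment above; please verify the exact names against the dataset card, as this is a sketch rather than a confirmed recipe.

```python
# Sketch: fetch the OKVQA data and passage corpus from the M2KR release
# on the Hugging Face Hub (requires `pip install datasets`).

M2KR_REPO = "BByrneLab/multi_task_multi_modal_knowledge_retrieval_benchmark_M2KR"

def load_okvqa_splits(repo_id=M2KR_REPO):
    """Download the OKVQA question data and its passage corpus.

    Subset names are assumptions based on the maintainer's comment;
    check the dataset card if they fail to resolve.
    """
    from datasets import load_dataset  # imported lazily; needs network access
    data = load_dataset(repo_id, "OKVQA_data")
    passages = load_dataset(repo_id, "OKVQA_passages")
    return data, passages
```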

ShuaiWang97 commented 2 days ago

Thank you very much for the prompt response. I have another question about the `example_use_preflmr.py` example given in the README.

After downloading the `--image_root_dir` data from here, the `--dataset_path` from here, and the `--passage_dataset_path` from here, I got `KeyError: 'ROIs'` from the line `ds = ds.map(add_path_prefix_in_img_path, fn_kwargs={"prefix": args.image_root_dir})`. When I inspected the dataset (shown below), there is indeed no "ROIs" key. Can you please give me some hints about this error? Thanks in advance!

LinWeizheDragon commented 2 days ago

It seems that we did not incorporate ROIs into the wiki version of OKVQA. You can merge in the ROIs column from here; it contains pre-extracted ROI features.
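A minimal sketch of such a merge, joining the pre-extracted ROI features into the wiki-version rows by a shared id. The key name `question_id` and the flat row structure are assumptions for illustration, not the dataset's confirmed schema; adapt the field names to whatever the actual splits use.

```python
# Hypothetical sketch: attach a "ROIs" column from one OKVQA split to
# another, joining rows on a shared id field. Field names are assumptions.

def merge_rois(wiki_rows, roi_rows, key="question_id"):
    """Return wiki_rows with a 'ROIs' field looked up from roi_rows by `key`."""
    roi_by_id = {row[key]: row["ROIs"] for row in roi_rows}
    merged = []
    for row in wiki_rows:
        new_row = dict(row)
        # Rows without a matching id get None rather than raising KeyError.
        new_row["ROIs"] = roi_by_id.get(row[key])
        merged.append(new_row)
    return merged
```

With Hugging Face `datasets` objects you could apply the same lookup inside a `ds.map(...)` call instead of building plain lists.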