Closed: zzk2021 closed this issue 5 months ago
Also, regarding the ROI part, do the features still need to be extracted with VinVL?
Yes, the code shows how to fine-tune the retriever first, and then fine-tune BLIP while keeping the retriever frozen. The ROI features can be extracted by following the instructions in the README. You can also replace VinVL with any other object detection model if you don't need to strictly reproduce the numbers in the paper.
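For anyone reading along, the two-stage schedule described above can be sketched roughly as follows. This is only a minimal illustration and not the repo's actual training code: `retriever` and `reader` are hypothetical stand-ins (small linear layers) for the real retriever and BLIP models, and the learning rates are placeholders.

```python
import torch
from torch import nn

# Hypothetical stand-ins: the real retriever and BLIP models come from the repo.
retriever = nn.Linear(16, 8)
reader = nn.Linear(8, 4)  # plays the role of BLIP in this sketch


def trainable(module):
    """Return only the parameters that still require gradients."""
    return [p for p in module.parameters() if p.requires_grad]


# Stage 1: fine-tune the retriever on its own.
stage1_optimizer = torch.optim.AdamW(trainable(retriever), lr=1e-5)
# ... the retriever training loop would go here ...

# Stage 2: freeze the retriever, then fine-tune only the reader.
retriever.requires_grad_(False)
retriever.eval()
stage2_optimizer = torch.optim.AdamW(trainable(reader), lr=1e-5)

# One dummy step: the frozen retriever only provides inputs to the reader.
x = torch.randn(2, 16)
with torch.no_grad():
    retrieved = retriever(x)
loss = reader(retrieved).sum()
loss.backward()
stage2_optimizer.step()
```

After the dummy step, only the reader's parameters have gradients; the retriever's weights are untouched, which is the behaviour you want in the second stage.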
Dear author, I would like to know the format of the file './output/[train/val/test]_predictions.json'. My run failed at the step path/to/azcopy copy 'https://biglmdiag.blob.core.windows.net/vinvl/datasets/coco_caption ./oscar_dataset --recursive
. I opened an issue at "https://github.com/microsoft/Oscar", but I still don't know how to fix it.
Do you want to get the predictions for OKVQA or for a custom dataset? The repo includes a zip file containing all the pre-extracted files.
Thanks! I wanted to know the format so that I can build my own custom datasets. I think the zip file will solve my problem.
As I understand it, the fine-tuning is done in two steps: first fine-tune the retriever, then fine-tune BLIP. Is that right?