Open ivonajdenkoska opened 3 months ago
Hi, thanks for your recognition. The 1k validation set is the first 1k instances in share-captioner_coco_lcs_sam_1246k_1107.json
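For anyone reproducing this, a minimal sketch of taking that split could look like the following. It assumes the JSON file is a single top-level list of entries, which this thread does not confirm; `first_k_instances` is a hypothetical helper name, not something from the repo.

```python
import json


def first_k_instances(json_path, k=1000):
    """Return the first k entries of a JSON file.

    Assumption: the file's top level is a list of instances.
    """
    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON list")
    # Slicing keeps the original order, so this is "the first k instances".
    return data[:k]
```

Usage would be something like `val_set = first_k_instances("share-captioner_coco_lcs_sam_1246k_1107.json")`.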
Cool, thanks! Is the code for the ShareGPT4V retrieval evaluation available?
You may refer to the 'test_epoch' function in train/train.py. Alternatively, you may rewrite the dataloader in eval/retrieval/Urban1k.py; most of the code can be reused.
Hi, are the 1k instances for eval from the SAM dataset? If so, can you tell me which data shard (tar file) you used to extract the images? Thanks a lot!
We only followed the ShareGPT4V instructions to prepare the data and did not check the source of the images. You may refer to https://sharegpt4v.github.io/ for detailed information.
Hi again! We downloaded and extracted the first 50 tars of SAM (as instructed in the ShareGPT4V repo). However, there are missing images (e.g. sam/images/sa_561780.jpg, sam/images/sa_561927.jpg, /sam/images/sa_564952.jpg, etc.) when we run the eval with the first 1k instances in share-captioner_coco_lcs_sam_1246k_1107.json.
Can you share the 1k validation split you have for eval (for instance as a zip file)? Thanks in advance!
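A quick way to list which images are missing before running the eval is sketched below. It assumes each entry is a dict that stores a relative image path under an `"image"` key, which is not confirmed in this thread; `find_missing_images` and its parameters are hypothetical names.

```python
import os


def find_missing_images(entries, image_root, key="image"):
    """Return the image paths from `entries` that do not exist under `image_root`.

    Assumption: every entry is a dict holding a relative image path at `key`.
    """
    missing = []
    for entry in entries:
        # Drop any leading slash so the path is joined relative to image_root.
        rel_path = entry[key].lstrip("/")
        if not os.path.isfile(os.path.join(image_root, rel_path)):
            missing.append(entry[key])
    return missing
```

Running it over the first 1k entries gives the exact list of files to look for in other SAM tars.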
Hi, thanks again for this cool work!
Where can I locate the random 1k (image, long text) pairs separated from ShareGPT4V for long-caption image-text retrieval evaluation? Can you release this data subset? Thanks a lot!