Closed TungWg closed 2 years ago
Hi, they are generated from the original refcoco+ annotation, by converting the bounding boxes into patch masks.
Thank you for your reply. Does "original refcoco+ annotation" mean instances.json and refs(unc).p? Are these JSON files or any conversion scripts available?
Here is the code snippet for the conversion. I used the official refer API to load the original annotation.
import os
import torch
from refer import REFER  # official refer API

data_root = './data'  # contains refclef, refcoco, refcoco+, refcocog and images
dataset = 'refcoco+'
splitBy = 'unc'
refer = REFER(data_root, dataset, splitBy)

split = 'test'
ref_ids = refer.getRefIds(split=split)

annotations = []
dim_w, dim_h = 384, 384
patch_size = 32
n_patch_w, n_patch_h = dim_w // patch_size, dim_h // patch_size  # 12 x 12 grid

for ref_id in ref_ids:
    ref = refer.Refs[ref_id]
    image = refer.Imgs[ref['image_id']]
    width, height = image['width'], image['height']
    # size of one grid cell, in pixels of the original image
    w_step = width / n_patch_w
    h_step = height / n_patch_h
    patch_area = height * width / (n_patch_w * n_patch_h)
    mask = refer.getMask(ref)['mask']  # binary mask of the referred object
    patch = []
    for i in range(n_patch_h):
        for j in range(n_patch_w):
            y0 = max(0, round(i * h_step))
            y1 = min(height, round((i + 1) * h_step))
            x0 = max(0, round(j * w_step))
            x1 = min(width, round((j + 1) * w_step))
            submask = mask[y0:y1, x0:x1]
            # fraction of this grid cell covered by the object
            patch.append(submask.sum() / patch_area)
    text = [sentence['sent'] for sentence in ref['sentences']]
    imgPath = os.path.join('/export/share/datasets/vision/coco/images/train2014', image['file_name'])
    annotation = {'image': imgPath, 'text': text, 'patch': patch, 'type': 'ref', 'ref_id': ref['ref_id']}
    annotations.append(annotation)

# store one copy of the patch tensor per sentence
for ann in annotations:
    ann['patch'] = [torch.Tensor(ann['patch']) for _ in range(len(ann['text']))]
torch.save(annotations, 'refcoco+_%s.pth' % split)
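To make the per-patch computation easier to check, here is a small dependency-free sketch of the same idea on a synthetic mask. It is an illustration rather than the exact code used to build the released files: the function name `patch_coverage` and the toy 48x48 mask are made up for this example, and plain nested lists stand in for the NumPy array returned by `refer.getMask`.

```python
def patch_coverage(mask, n_patch_h=12, n_patch_w=12):
    """Return a row-major list with, for each grid cell, the fraction of the
    nominal cell area that is covered by the binary mask.

    `mask` is a list of rows (height x width) of 0/1 values. This name and
    signature are hypothetical, chosen only for this illustration.
    """
    height, width = len(mask), len(mask[0])
    h_step = height / n_patch_h
    w_step = width / n_patch_w
    patch_area = h_step * w_step  # nominal area of one grid cell, in pixels
    coverage = []
    for i in range(n_patch_h):
        y0 = max(0, round(i * h_step))
        y1 = min(height, round((i + 1) * h_step))
        for j in range(n_patch_w):
            x0 = max(0, round(j * w_step))
            x1 = min(width, round((j + 1) * w_step))
            # count the mask pixels that fall inside this cell
            covered = sum(sum(row[x0:x1]) for row in mask[y0:y1])
            coverage.append(covered / patch_area)
    return coverage


# Toy mask: a 48x48 image with an 8-row by 6-column object in the top-left corner.
mask = [[1 if (y < 8 and x < 6) else 0 for x in range(48)] for y in range(48)]
cov = patch_coverage(mask)  # 12 x 12 grid, so each cell is 4 x 4 pixels
```

With this toy mask, cells that lie fully inside the object get a value of 1.0, cells that the object's right edge cuts in half get 0.5, and all other cells get 0.0.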
OK, thanks a lot!
Hi, thanks for the excellent work. I would like to know how to generate the JSON files refcoco+_train.json, refcoco+_val.json, and refcoco+_test.json in data.tar.gz. How can I get the corresponding JSON files for the refcoco and refcocog datasets?