Open 12cyan opened 1 year ago
I've noticed that before. I suggest simply filtering the three index lists in dataloader.py with:
def read_index(path):
    # One integer sample index per line
    with open(path, "r") as f:
        return [int(line.strip()) for line in f if line.strip()]

trainIndex = read_index('DIOR_RSVG/' + 'train.txt')
testIndex = read_index('DIOR_RSVG/' + 'test.txt')
valIndex = read_index('DIOR_RSVG/' + 'val.txt')
# Sets make the membership checks below fast
trainSet, valSet = set(trainIndex), set(valIndex)
count = 0
if self.split == "train":
    Index = trainIndex
elif self.split == "val":
    # Drop validation indices that also appear in train
    Index = [i for i in valIndex if i not in trainSet]
elif self.split == "test":
    # Drop test indices that also appear in train or val
    Index = [i for i in testIndex if i not in trainSet and i not in valSet]
Then you'll get 15328 samples for training, 2311 for validation, and 10234 for testing, which should be enough for verifying your own model. You can also verify it on RefCOCOunc, which is a representative dataset for the VG task.
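If you want to sanity-check the filtering logic without the dataset files, here is a minimal sketch of the same idea on toy lists. The function name `dedupe_splits` and the toy indices are made up for illustration; they are not part of the repo or of DIOR_RSVG.

```python
def dedupe_splits(train, val, test):
    """Keep train as-is, drop from val anything in train,
    and drop from test anything in train or val."""
    train_set = set(train)
    val_set = set(val)
    new_val = [i for i in val if i not in train_set]
    new_test = [i for i in test if i not in train_set and i not in val_set]
    return list(train), new_val, new_test


# Toy indices (hypothetical, not taken from the real split files)
train, val, test = dedupe_splits([1, 2, 6], [2, 3, 4], [4, 5, 6, 7])
print(train, val, test)
```

Note that train takes priority over val, and both take priority over test, so an index shared by several files only survives in the first split that lists it.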
For example, both test.txt and train.txt contain the index 6.
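A quick way to see how many indices are shared between split files is to intersect them pairwise. This is only a sketch: `split_overlaps` is a hypothetical helper, and the toy lists stand in for the contents of train.txt, val.txt and test.txt.

```python
def split_overlaps(splits):
    """Return {(name_a, name_b): sorted shared indices}
    for every pair of splits that has at least one index in common."""
    names = list(splits)
    overlaps = {}
    for pos, a in enumerate(names):
        for b in names[pos + 1:]:
            shared = set(splits[a]) & set(splits[b])
            if shared:
                overlaps[(a, b)] = sorted(shared)
    return overlaps


# Toy data (hypothetical); in practice, load each list from its .txt file
toy = {"train": [1, 2, 6], "val": [2, 3], "test": [5, 6]}
overlaps = split_overlaps(toy)
print(overlaps)
```

An empty result would mean the splits are already disjoint and no filtering is needed.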