Closed lycutter closed 3 years ago
Hi, thanks for your question! After downloading the original VQA_RAD dataset, you will have the "VQA_RAD Dataset Public.json" file. The script below splits it into trainset.json and testset.json:
import json

pathfile = 'VQA_RAD Dataset Public.json'
with open(pathfile, "r") as infile:
    data_records = json.load(infile)

test_set = []
train_set = []
count = 0  # running question id, shared by train and test samples

for record in data_records:
    sample = {}
    count += 1
    sample['qid'] = count
    sample['image_name'] = record['image_name']
    sample['image_organ'] = record['image_organ']
    sample['answer'] = record['answer']
    sample['answer_type'] = record['answer_type']
    sample['question_type'] = record['question_type']

    # Normalize phrase_type to "freeform" or "para"
    if "freeform" in record['phrase_type']:
        sample['question'] = record['question']
        sample['phrase_type'] = "freeform"
    elif "para" in record['phrase_type']:
        sample['question'] = record['question']
        sample['phrase_type'] = "para"

    # Records whose phrase_type contains "test" go to the test set
    if "test" in record['phrase_type']:
        test_set.append(sample.copy())
    else:
        train_set.append(sample.copy())
        # Nesting under "else" is assumed here (the indentation was lost),
        # so that frame questions are only added for training records.
        if record['question_frame'] != 'NULL':
            count += 1
            sample['qid'] = count
            sample['question'] = record['question_frame']
            sample['phrase_type'] = "frame"
            train_set.append(sample.copy())

with open('trainset.json', 'w') as outfile:
    json.dump(train_set, outfile)
with open('testset.json', 'w') as outfile:
    json.dump(test_set, outfile)
To generate the "images84x84" and "images128x128" files, follow these steps:
- Load the images as grayscale and resize them to (84, 84) and (128, 128), respectively.
- Normalize the pixel values to the range [0, 1] by dividing them by 255.
*Note: the images must be stored in the order given by the file "imgid2idx.json".
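The image steps above can be sketched roughly as follows. This is only a minimal sketch, not the code used in the repository: it assumes Pillow and NumPy are installed, that "imgid2idx.json" maps each image file name to a contiguous integer index starting at 0, and that saving with `np.save` is acceptable. The function names, the `.npy` extension, and the directory arguments are all hypothetical, so adapt them to whatever format your training code expects.

```python
import json
import os

import numpy as np
from PIL import Image


def build_image_array(image_dir, imgid2idx, size):
    """Return a float32 array of shape (N, size, size) with values in [0, 1].

    Row i holds the image whose file name maps to index i in imgid2idx
    (assumed layout of imgid2idx.json: {"image_name": index, ...}).
    """
    images = np.zeros((len(imgid2idx), size, size), dtype=np.float32)
    for name, idx in imgid2idx.items():
        with Image.open(os.path.join(image_dir, name)) as img:
            gray = img.convert("L")              # 8-bit grayscale
            resized = gray.resize((size, size))  # (width, height)
        # Dividing by 255 maps 8-bit pixel values into [0, 1]
        images[idx] = np.asarray(resized, dtype=np.float32) / 255.0
    return images


def save_image_arrays(image_dir, mapping_path, out_dir, sizes=(84, 128)):
    """Write one array per size, e.g. images84x84.npy (hypothetical naming)."""
    with open(mapping_path, "r") as f:
        imgid2idx = json.load(f)
    for size in sizes:
        arr = build_image_array(image_dir, imgid2idx, size)
        np.save(os.path.join(out_dir, f"images{size}x{size}.npy"), arr)
```

Writing each image to the row given by its index, instead of appending in directory-listing order, is what keeps the array aligned with "imgid2idx.json".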
Hello, I am interested in VQA in the medical field, and your work is amazing! However, I am confused about how to generate files such as trainset.json, testset.json, images84x84, and images128x128 in the data_RAD directory from the original data. Could you please share the code for this? Thank you!