code for generating the files in data_RAD

Hi, thanks for your question! After downloading the original VQA_RAD dataset, you will have the file "VQA_RAD Dataset Public.json".

To generate the "trainset.json" and "testset.json" files, run the following code:


import json

pathfile = 'VQA_RAD Dataset Public.json'
with open(pathfile, "r") as infile:
    data_records = json.load(infile)

test_set = []
train_set = []
count = 0  # running question id, shared by the train and test sets

for record in data_records:
    sample = {}
    count += 1
    sample['qid'] = count
    sample['image_name'] = record['image_name']
    sample['image_organ'] = record['image_organ']
    sample['answer'] = record['answer']
    sample['answer_type'] = record['answer_type']
    sample['question_type'] = record['question_type']

    # Normalize phrase_type (e.g. a value containing "freeform" becomes "freeform").
    if "freeform" in record['phrase_type']:
        sample['question'] = record['question']
        sample['phrase_type'] = "freeform"
    elif "para" in record['phrase_type']:
        sample['question'] = record['question']
        sample['phrase_type'] = "para"

    # Records whose phrase_type contains "test" go to the test set.
    if "test" in record['phrase_type']:
        test_set.append(sample.copy())
    else:
        train_set.append(sample.copy())
        # Nesting inferred: the framed question is added as an extra
        # training sample with its own qid.
        if record['question_frame'] != 'NULL':
            count += 1
            sample['qid'] = count
            sample['question'] = record['question_frame']
            sample['phrase_type'] = "frame"
            train_set.append(sample.copy())

with open('trainset.json', 'w') as outfile:
    json.dump(train_set, outfile)
with open('testset.json', 'w') as outfile:
    json.dump(test_set, outfile)


- To generate the "images84x84" and "images128x128" files, process the images as follows:

  - Load the images as grayscale and resize them to (84, 84) and (128, 128), respectively.
  - Normalize the pixel values to the range [0, 1] by dividing by 255.
  - Note: store the images in the order given by the file "imgid2idx.json".
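
The image steps above could be sketched roughly as follows. This is only a minimal illustration, not the exact script used for the released files. It assumes that "imgid2idx.json" maps each image file name to an integer index, that Pillow and NumPy are available, and that the result is stored as a pickled NumPy array. The helper names `build_image_array` and `dump_images`, the output file names, and the image folder path are hypothetical.

```python
import json
import pickle
from pathlib import Path

import numpy as np
from PIL import Image


def build_image_array(image_dir, imgid2idx, size):
    """Return a float32 array of shape (N, size, size) with values in [0, 1].

    Row i holds the image whose index in imgid2idx is i, so the array
    order matches the mapping (assumed format: {"image_name": index}).
    """
    images = np.zeros((len(imgid2idx), size, size), dtype=np.float32)
    for name, idx in imgid2idx.items():
        with Image.open(Path(image_dir) / name) as img:
            # "L" converts to 8-bit grayscale (pixel values 0..255).
            gray = img.convert("L").resize((size, size))
        # Divide by 255 to map pixel values into [0, 1].
        images[idx] = np.asarray(gray, dtype=np.float32) / 255.0
    return images


def dump_images(image_dir, imgid2idx_path, out_path, size):
    """Load the index mapping, build the array, and pickle it (assumed format)."""
    with open(imgid2idx_path, "r") as f:
        imgid2idx = json.load(f)
    with open(out_path, "wb") as f:
        pickle.dump(build_image_array(image_dir, imgid2idx, size), f)
```

With these helpers, something like `dump_images("path/to/images", "imgid2idx.json", "images84x84.pkl", 84)` and the same call with `128` would produce the two files, provided the assumptions above hold for your copy of the data.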

