ChenRocks / UNITER

Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
https://arxiv.org/abs/1909.11740

txt_db preprocessing code for VQA #67

Closed tejas-gokhale closed 3 years ago

tejas-gokhale commented 3 years ago

Hi, could you also release the prepro.py code for VQA? I have my own alternative questions and answers for VQA images, and I want to test UNITER (and VILLA) on them.

It seems that prepro.py only has a process_nlvr2() function and nothing equivalent for VQA.

estelleafl commented 3 years ago

Hi @tejas-gokhale! Did you eventually work out how to do the preprocessing for VQA? Thank you!

tejas-gokhale commented 3 years ago

Hello @estelleafl, yes, with some minimal edits to process_nlvr2() we were able to do it. Here is a snippet from my code that you can try. @ChenRocks, you can add this to your codebase too if you find it useful. Cheers!

import json

from tqdm import tqdm


def process_vqa(annotation_file, db, tokenizer, missing=None):
    """Tokenize VQA annotations into `db` and build the id/image indexes."""
    with open(annotation_file, 'r') as infile:
        data = json.load(infile)

    id2len = {}    # question id -> number of input tokens
    txt2img = {}   # question id -> image feature file (not sure if useful)
    img2txts = {}  # image feature file -> list of question ids

    for example in tqdm(data, desc='processing VQA'):
        id_ = str(example['question_id'])
        img_fname = example['img_id'] + '.npz'
        img_fname = img_fname.replace("COCO", "coco")
        # Unlike NLVR2, each VQA example has a single image, so check the
        # whole filename rather than indexing into it.
        if missing and img_fname in missing:
            continue
        input_ids = tokenizer(example['sent'])
        example['target'] = {
            'labels': list(example['label'].keys()),
            'scores': list(example['label'].values()),
        }

        txt2img[id_] = img_fname
        id2len[id_] = len(input_ids)
        # Key by the filename variable (not the string "img_fname") and
        # always store a list so later appends work.
        img2txts.setdefault(img_fname, []).append(id_)
        example['input_ids'] = input_ids
        example['img_fname'] = img_fname

        db[id_] = example

    return id2len, txt2img, img2txts
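
For anyone adapting this, here is a minimal, self-contained sketch of the annotation format the snippet above reads and of the three indexes it is meant to return. Only the key names (question_id, img_id, sent, label) come from the snippet; the record values, toy_tokenizer, and build_indexes are hypothetical stand-ins so the example runs without BERT or an LMDB database.

```python
from collections import defaultdict

# Hypothetical annotation records; only the key names come from the snippet.
annotations = [
    {"question_id": 1, "img_id": "COCO_val2014_000000000042",
     "sent": "What color is the cat?", "label": {"black": 1.0, "gray": 0.3}},
    {"question_id": 2, "img_id": "COCO_val2014_000000000042",
     "sent": "Is the cat asleep?", "label": {"yes": 1.0}},
]


def toy_tokenizer(text):
    """Stand-in for the real tokenizer: one fake id per whitespace token."""
    return list(range(len(text.split())))


def build_indexes(records, tokenizer):
    """Build the question-length and question/image lookup tables."""
    id2len, txt2img, img2txts = {}, {}, defaultdict(list)
    for rec in records:
        id_ = str(rec["question_id"])
        img_fname = (rec["img_id"] + ".npz").replace("COCO", "coco")
        id2len[id_] = len(tokenizer(rec["sent"]))
        txt2img[id_] = img_fname
        # Several questions can share one image, so collect ids in a list.
        img2txts[img_fname].append(id_)
    return id2len, txt2img, dict(img2txts)


id2len, txt2img, img2txts = build_indexes(annotations, toy_tokenizer)
print(img2txts)  # {'coco_val2014_000000000042.npz': ['1', '2']}
```

Both questions map to the same feature file, which is why img2txts has to hold a list of question ids per image rather than a single id.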

estelleafl commented 3 years ago

Thanks a lot!

sri13 commented 7 months ago

@tejas-gokhale Hi Tejas, do you have a fine-tuned UNITER model for VQA? I can't find one in @ChenRocks's Azure blob container, comparable to https://acvrpublicycchen.blob.core.windows.net/uniter/finetune/nlvr-base.tar. I'd like to run the inference script. Any help is greatly appreciated.