shizhediao / DaVinci

Source code for the paper "Prefix Language Models are Unified Modal Learners"
BSD 3-Clause "New" or "Revised" License

About the ve_train/dev/test.json data #4

Open Pefect96 opened 5 months ago

Pefect96 commented 5 months ago

Where can I find the three files ve_train.json, ve_dev.json, and ve_test.json? From https://github.com/CpuKnows/SNLI-VE I can only get snli_ve_train.jsonl, snli_ve_dev.jsonl, and snli_ve_test.jsonl, which cannot be used with this code. I hope you can help me.

shizhediao commented 4 months ago

Hi, you need to adapt the downloaded dataset to our training format. Basically, put the images into a separate folder under images/ and put the metadata as a JSON file in data/:

DaVinci/
    data/
        coco_test.json
        coco_train.json
        coco_val.json
        *.json

    images/
        coco/
            train2014/*.jpg
            val2014/*.jpg
            test2015/*.jpg

        visualgenome/
            image/*.jpg

        nlvr2/
            images/
                train/0-99/*.png
            dev/*.png
            test1/*.png

The JSON file contains a list. Each item in the list is a dictionary with two key-value pairs, the base64 encoding of the image and its caption text: {'binary': bs64_encoding_of_the_image, 'caption': text_of_image}.
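
For anyone adapting their own data, here is a minimal sketch of how a file in the list-of-dictionaries layout described above could be produced. It is not code from the DaVinci repository: the helper names (encode_image, build_records, write_json) are hypothetical, and it assumes that 'binary' holds the base64 encoding of the raw image file bytes, which the thread does not spell out. Whether the ve_*.json files use exactly these two keys is also not confirmed here, so check the dataset loader in the repository before relying on it.

```python
import base64
import json
from pathlib import Path


def encode_image(image_path):
    """Return the raw bytes of the image file as a base64 ASCII string.

    Assumption: the loader expects the encoded file bytes, not decoded pixels.
    """
    return base64.b64encode(Path(image_path).read_bytes()).decode("ascii")


def build_records(pairs):
    """Turn (image_path, caption) pairs into a list of
    {'binary': ..., 'caption': ...} dictionaries."""
    return [
        {"binary": encode_image(path), "caption": caption}
        for path, caption in pairs
    ]


def write_json(records, out_path):
    """Write the records as a single JSON list, creating parent folders."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", encoding="utf-8") as f:
        json.dump(records, f)
```

A call such as write_json(build_records(pairs), "data/my_dataset.json"), where pairs is your own list of (image path, caption) tuples, would place the metadata file under data/ as in the layout above. The output file name here is only a placeholder.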