How is the dataset for ERNIE-ViL's VQA task fed in?

PaddlePaddle / ERNIE

Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.

How is the dataset for ERNIE-ViL's VQA task fed in? #761

Closed Takeru0618 closed 2 years ago

Takeru0618 commented 3 years ago

I downloaded these datasets from the official VQA website.

I'd like to ask how this data is supposed to be used.

BinglengTang commented 2 years ago

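The input in the code looks like this: "question_id, text, match_label, score, image_w, image_h, number_box, image_loc, image_embeddings". The first fields can all be obtained from the questions and annotations files; the image fields that follow are extracted with the bottom-up and top-down feature extractor.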
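As a rough illustration of the text-side fields only, here is a minimal Python sketch that pairs one VQA question entry with its annotation entry. It is not the ERNIE-ViL preprocessing code. The helper names (`soft_scores`, `build_text_fields`), the `answer_to_label` vocabulary, and the `min(1, count / 3)` soft score are all assumptions made for illustration; the repository may compute `match_label` and `score` differently. The image fields (`image_w`, `image_h`, `number_box`, `image_loc`, `image_embeddings`) are left out because they come from the feature extractor.

```python
from collections import Counter


def soft_scores(answers, answer_to_label):
    """Map annotator answers to (label ids, soft scores).

    Assumption: score = min(1, count / 3), the common VQA soft-accuracy
    rule. The actual ERNIE-ViL preprocessing may differ.
    """
    counts = Counter(a["answer"] for a in answers)
    labels, scores = [], []
    for answer, n in sorted(counts.items()):
        if answer in answer_to_label:  # skip answers outside the vocabulary
            labels.append(answer_to_label[answer])
            scores.append(min(1.0, n / 3.0))
    return labels, scores


def build_text_fields(question, annotation, answer_to_label):
    """Build the text-side fields of one record (hypothetical helper)."""
    if question["question_id"] != annotation["question_id"]:
        raise ValueError("question and annotation do not match")
    labels, scores = soft_scores(annotation["answers"], answer_to_label)
    return {
        "question_id": question["question_id"],
        "text": question["question"],
        "match_label": labels,
        "score": scores,
    }


# Toy entries shaped like the VQA questions/annotations JSON records.
question = {"question_id": 1, "image_id": 42, "question": "What color is the cat?"}
annotation = {
    "question_id": 1,
    "image_id": 42,
    "answers": [{"answer": "black"}] * 4 + [{"answer": "gray"}, {"answer": "dark"}],
}
answer_to_label = {"black": 0, "gray": 1}  # hypothetical answer vocabulary

record = build_text_fields(question, annotation, answer_to_label)
print(record)
```

The remaining image fields would then be joined to each record by `image_id` once the region features have been extracted.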

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reopen it. Thank you for your contributions.