ChenRocks / UNITER

Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
https://arxiv.org/abs/1909.11740

How to fine-tune on my own dataset? #55

Open wjy3326 opened 3 years ago

wjy3326 commented 3 years ago

I want to fine-tune the UNITER model on my own dataset. How do I generate the LMDB datasets for images and text? I have generated the image features with Faster R-CNN, but how do I convert the text content and image features into the UNITER input format? Is there any code for this in your GitHub repo? Thanks!

floschne commented 3 years ago

@wjy3326 How did you extract the images? Using the dockerfile?

wjy3326 commented 3 years ago

I used the Faster R-CNN from https://github.com/jwyang/faster-rcnn.pytorch with some code changes, together with the script https://github.com/ChenRocks/BUTD-UNITER-NLVR2/blob/master/tools/generate_npz.py.

foxm79 commented 3 years ago

I am also looking for the text and image DB generation, but so far I have only worked out the single-example case.

For a single npz image file, I did:

d = load(file_name_npz)
img_feat = d['features']
bb = d['norm_bb']

For a single text:

toker = BertTokenizer.from_pretrained('bert-base-cased', do_lower_case=False)
tokenizer = bert_tokenize(toker)
ids = tokenizer(str)

and then prepended the 'CLS' token.

But I don't know how to generate a full img_db and the corresponding txt_db.
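
One way the per-example steps above might be extended to a whole dataset is sketched below. It is a minimal illustration rather than the repository's actual preprocessing: the record fields (`id`, `sentence`, `img_fname`, `target`), the `build_txt_records` helper, and the toy tokenizer are all hypothetical, and a plain dict stands in for the LMDB so the sketch runs without extra dependencies. The idea is to tokenize every sentence once, skip examples whose image feature file is missing, and keep two side tables (text length per id, image file per id) next to the records.

```python
import json
from typing import Callable, Dict, Iterable, List, Set, Tuple


def build_txt_records(
    examples: Iterable[dict],
    tokenize: Callable[[str], List[int]],
    available_imgs: Set[str],
) -> Tuple[Dict[str, dict], Dict[str, int], Dict[str, str]]:
    """Build txt_db-style records plus id->length and id->image lookup tables.

    All field names here are assumptions made for illustration.
    """
    db: Dict[str, dict] = {}
    id2len: Dict[str, int] = {}
    txt2img: Dict[str, str] = {}
    for ex in examples:
        if ex["img_fname"] not in available_imgs:
            continue  # no extracted features for this image, so skip the example
        input_ids = tokenize(ex["sentence"])
        db[ex["id"]] = dict(ex, input_ids=input_ids)  # copy and add token ids
        id2len[ex["id"]] = len(input_ids)
        txt2img[ex["id"]] = ex["img_fname"]
    return db, id2len, txt2img


def make_toy_tokenizer() -> Callable[[str], List[int]]:
    """Whitespace tokenizer stand-in; a real run would use a BERT tokenizer."""
    vocab: Dict[str, int] = {}

    def tokenize(text: str) -> List[int]:
        # Assign ids 1, 2, 3, ... in order of first appearance.
        return [vocab.setdefault(w, len(vocab) + 1) for w in text.lower().split()]

    return tokenize


examples = [
    {"id": "ex0", "sentence": "a dog on a beach", "img_fname": "img0.npz", "target": 1},
    {"id": "ex1", "sentence": "two cats", "img_fname": "img_missing.npz", "target": 0},
]
db, id2len, txt2img = build_txt_records(examples, make_toy_tokenizer(), {"img0.npz"})

# The lookup tables are plain dicts, so they can be saved as JSON next to the DB.
print(json.dumps(id2len))
print(json.dumps(txt2img))
```

In a real run, `tokenize` would be the `bert_tokenize(toker)` callable from the comment above, and each record would be written into an LMDB instead of a dict. The exact fields and side files the UNITER data loaders expect should be checked against the preprocessing scripts in the repository.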