Open wjy3326 opened 3 years ago
@wjy3326 How did you extract the images? Using the dockerfile?
I use the Faster R-CNN from https://github.com/jwyang/faster-rcnn.pytorch with some code changes, together with the script https://github.com/ChenRocks/BUTD-UNITER-NLVR2/blob/master/tools/generate_npz.py.
I am also looking for the text and img db generation. So far, for a single npz image, I did:
d = load(file_name_npz)
img_feat = d['features']
bb = d['norm_bb']
For a single text:
toker = BertTokenizer.from_pretrained('bert-base-cased', do_lower_case=False)
tokenizer = bert_tokenize(toker)
ids = tokenizer(str)
and then prepended the 'CLS' token to the ids.
But I don't know how to generate a full img_db and the corresponding txt_db.
I want to fine-tune the UNITER model on my own dataset. How do I generate the LMDB datasets for the images and text? I generated the image features with Faster R-CNN, but how do I convert the text and the image features into the UNITER input format? Is there any code for this in your GitHub repository? Thanks!
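Not an official answer, but here is a minimal sketch of the general idea of turning per-example token ids into key/value records and writing them to an LMDB. Everything specific in it is an assumption for illustration: the helper names `build_txt_records` and `write_lmdb`, the field names `input_ids` and `img_fname`, the JSON serialization, and the `cls_id` value are hypothetical and may not match what UNITER's data loaders expect, so please check the repository's preprocessing scripts for the real layout. It also assumes the third-party `lmdb` package is installed for the writing step.

```python
import json


def build_txt_records(examples, cls_id):
    """Build {key_bytes: value_bytes} records from tokenized examples.

    examples: iterable of (example_id, token_ids, img_fname) tuples.
    cls_id:   id of the CLS token in your tokenizer's vocabulary
              (look it up from the tokenizer; not hard-coded here).
    The record layout is a hypothetical one, not UNITER's confirmed format.
    """
    records = {}
    for ex_id, token_ids, img_fname in examples:
        record = {
            # prepend the CLS id, as described in the comment above
            "input_ids": [cls_id] + list(token_ids),
            # name of the npz file holding this example's image features
            "img_fname": img_fname,
        }
        records[ex_id.encode("utf-8")] = json.dumps(record).encode("utf-8")
    return records


def write_lmdb(records, path, map_size=1 << 30):
    """Write the records into an LMDB directory at `path`."""
    import lmdb  # third-party package, imported lazily

    env = lmdb.open(path, map_size=map_size)
    # the write transaction is committed when the with-block exits
    with env.begin(write=True) as txn:
        for key, value in records.items():
            txn.put(key, value)
    env.close()
```

Usage would look roughly like `write_lmdb(build_txt_records(my_examples, cls_id=my_cls_id), "my_txt_db")`, where `my_examples`, `my_cls_id`, and the output path are placeholders. The image side could follow the same pattern, with the npz file name as the key and the serialized `features` and `norm_bb` arrays as the value.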