airsplay / lxmert

PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".
MIT License

number of data characters(457269) cannot be 1 more than a multiple of 4 #70

Open YangYL18 opened 4 years ago

YangYL18 commented 4 years ago

Hi, thank you for your code. When I run the command bash run/vqa_finetune.bash 1 vqa_lxr955, something goes wrong. The error information:

Load 632117 data from split(s) train,nominival
Start to load Faster-RCNN detected objects from data/mscoco_imgfeat/train2014_obj36.tsv
Traceback (most recent call last):
  File "src/tasks/vqa.py", line 178, in <module>
    vqa = VQA()
  File "src/tasks/vqa.py", line 37, in __init__
    args.train, bs=args.batch_size, shuffle=True, drop_last=True
  File "src/tasks/vqa.py", line 22, in get_data_tuple
    tset = VQATorchDataset(dset)
  File "/data2/192202018/code/Lxmert-master/src/tasks/vqa_data.py", line 102, in __init__
    topk=load_topk))
  File "/data2/192202018/code/lxmert-master/src/utils.py", line 45, in load_obj_tsv
    item[key] = np.frombuffer(base64.b64decode(item[key]), dtype=dtype)
  File "/home/y192202018/anaconda3/envs/ganomaiy/lib/python3.7/base64.py", line 87, in b64decode
    return binascii.a2b_base64(s)
binascii.Error: Invalid base64-encoded string: number of data characters (457269) cannot be 1 more than a multiple of 4

Can you give me some advice? Thank you very much!
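For context, Python's binascii module raises this error when the count of base64 data characters leaves a remainder of 1 after division by 4, which no valid encoding can produce. The sketch below reproduces it under an assumption this thread does not confirm: that a field in the TSV was cut short, for example by an incomplete download. The payload and the truncation point are made up for illustration.

```python
import base64
import binascii

# Stand-in for one encoded field; the real file stores much larger arrays.
payload = bytes(range(24))
encoded = base64.b64encode(payload)  # 24 bytes encode to 32 characters

# A complete field decodes back to the original bytes.
assert base64.b64decode(encoded) == payload

# Hypothetical truncation: keep 29 characters, so 29 % 4 == 1.
truncated = encoded[:29]
try:
    base64.b64decode(truncated)
    error_message = None
except binascii.Error as exc:
    error_message = str(exc)

print(error_message)
```

If the same kind of message appears for train2014_obj36.tsv, comparing the file size or checksum against the published download may be a reasonable first check.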

airsplay commented 4 years ago

Hmm, not sure. Maybe try reading the file with the 'rb' option?
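One way to narrow the problem down is to scan the TSV for fields that fail to decode before loading everything. The helper below is a sketch, not part of the LXMERT code: `find_bad_rows` is a hypothetical name, and the caller must supply the zero-based indices of the columns expected to hold base64 text.

```python
import base64
import binascii
import csv


def find_bad_rows(tsv_path, b64_columns):
    """Return (row_number, column_index, message) for fields that fail to decode.

    Hypothetical helper for illustration; b64_columns lists the zero-based
    indices of the columns expected to contain base64 text.
    """
    csv.field_size_limit(2**31 - 1)  # encoded feature fields can be very long
    bad = []
    with open(tsv_path, newline="") as f:
        reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row_number, row in enumerate(reader):
            for col in b64_columns:
                if col >= len(row):
                    # The row ended early, so the column is absent entirely.
                    bad.append((row_number, col, "missing column"))
                    continue
                try:
                    base64.b64decode(row[col])
                except binascii.Error as exc:
                    bad.append((row_number, col, str(exc)))
    return bad
```

An empty result would suggest the file itself decodes cleanly and the cause lies elsewhere. A non-empty result gives the row numbers to inspect, and a bad final row would be consistent with a truncated file.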

yezhengli-Mr9 commented 3 years ago

How are train2014_obj36.tsv and val2014_obj36.tsv in mscoco_imgfeat generated? I would like to run the model on my own dataset with my own images.

The discussion is extended in issue #79.