luo3300612 / image-captioning-DLCT

Official PyTorch implementation of the paper "Dual-Level Collaborative Transformer for Image Captioning" (AAAI 2021).
BSD 3-Clause "New" or "Revised" License
194 stars 31 forks

About features on Baiduyun disk #4

Closed (john2019-warwick closed 3 years ago)

john2019-warwick commented 3 years ago

Hello, could you explain the features shared on the net disk? Is coco_all_align.hdf5 in the zip file? And what are the files ending with z01, z02, z03? I have tried to extract the zip file, but it fails. (screenshot: feature)

luo3300612 commented 3 years ago

Please try extracting it on Windows.

john2019-warwick commented 3 years ago

Thanks, I've already solved this issue. But when training reaches line 138 of train.py:

https://github.com/luo3300612/image-captioning-DLCT/blob/11cff4c24636a1b54974750889842a1a424c7060/train.py#L138

reward = torch.from_numpy(reward).to(device).view(detections.shape[0], beam_size)

there is an error (screenshot: 07032021).

This happens when the RL stage starts. I think view can't be used here because reward is a scalar (size 1) before line 138. It would first need to be expanded into a vector or matrix with 100 numbers, and only then could view reshape it to 5*20. Do you have any idea how to fix this given the context?
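The size mismatch described here can be checked before the view call is reached. Below is a minimal diagnostic sketch, assuming reward is still a NumPy array at that point (as the torch.from_numpy call suggests). The helper name check_reward_shape and the sizes (20 images, beam size 5) are hypothetical and chosen only to match the 100 values mentioned above. It is not part of the repository and does not reproduce whatever fix issue #5 describes.

```python
import numpy as np


def check_reward_shape(reward, batch_size, beam_size):
    """Hypothetical helper: ensure reward holds one value per (image, beam) pair."""
    reward = np.asarray(reward, dtype=np.float32)
    expected = batch_size * beam_size
    if reward.size != expected:
        # A reshape/view can only rearrange elements, never create new ones.
        raise ValueError(
            f"reward has {reward.size} element(s), expected {expected} "
            f"({batch_size} images x {beam_size} beams)"
        )
    return reward.reshape(batch_size, beam_size)


# Assumed sizes for illustration only: 20 images, beam size 5 (100 values).
ok = check_reward_shape(np.arange(100), batch_size=20, beam_size=5)
print(ok.shape)

# A scalar reward, as reported in the question, fails the check.
try:
    check_reward_shape(np.float32(0.5), batch_size=20, beam_size=5)
except ValueError as err:
    print(err)
```

If the check fails, the cause is likely upstream of line 138 (whatever computes reward returned a single value instead of one per caption), so reshaping alone would not help.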

luo3300612 commented 3 years ago

Please refer to issue #5.