songkq opened this issue 1 year ago
Another issue is in `dataset.py`:

```python
def hierarchical_dataset(root, opt, select_data='/', data_filtering_off=False, global_rank=0):
    ...
    # for dirpath, dirnames, filenames in os.walk(root+'/'):
    for dirpath, dirnames, filenames in os.walk(root+'/'+select_data[0]):
    ...
```
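For context, if I read the snippet correctly, the change matters when `select_data` names specific sub-folders of `root`: walking `root+'/'` traverses every sub-folder under `root`, while walking `root+'/'+select_data[0]` restricts the walk to the selected one. A minimal sketch of the difference (the folder names below are hypothetical, not taken from the repository):

```python
import os

root = 'data_lmdb'          # hypothetical root folder
select_data = ['MJ', 'ST']  # hypothetical sub-folders under root

# Walks only data_lmdb/MJ instead of every sub-folder under data_lmdb/
for dirpath, dirnames, filenames in os.walk(root + '/' + select_data[0]):
    print(dirpath)
```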
Another issue is in `models_mvlt.py`: the following doesn't work in PyTorch==1.8.1

```python
t_embed = torch.where(
    w_mask.unsqueeze(-1).expand(-1, -1, self.decoder_embed_dim), text_mask_tokens.float(), t_embed)
```

```
RuntimeError: expected scalar type float but found c10::Half
```

Casting with `t_embed.float()` instead works well:

```python
t_embed = torch.where(
    w_mask.unsqueeze(-1).expand(-1, -1, self.decoder_embed_dim), text_mask_tokens, t_embed.float())
```
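A minimal standalone sketch of the dtype mismatch (the tensor names and the embedding size `8` are illustrative stand-ins for the model's `decoder_embed_dim`, not the actual model code). Under fp16/autocast, `t_embed` is half precision while `text_mask_tokens.float()` is fp32, and `torch.where` in PyTorch 1.8.1 requires both branches to share a dtype:

```python
import torch

t_embed = torch.randn(2, 4, 8, dtype=torch.half)   # half precision, as under autocast
text_mask_tokens = torch.zeros(2, 4, 8)            # fp32
w_mask = torch.zeros(2, 4, dtype=torch.bool)

# Raises "expected scalar type float but found c10::Half" on PyTorch 1.8.1:
# out = torch.where(w_mask.unsqueeze(-1).expand(-1, -1, 8), text_mask_tokens.float(), t_embed)

# Casting t_embed to float makes both branches fp32, so the call succeeds:
out = torch.where(w_mask.unsqueeze(-1).expand(-1, -1, 8), text_mask_tokens, t_embed.float())
print(out.dtype)  # torch.float32
```

Newer PyTorch versions promote dtypes inside `torch.where`, which is why the mismatch only surfaces on 1.8.1.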
@songkq We use `len(label)+1` in the class `AlignCollate` and `mask_idc = mask_idc + 1` in the `RandomWordMaskingGenerator` because the mask token is used as a separator token between the visual tokens and the textual tokens. Regarding `dataset.py`, I don't understand what your issue is.
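A minimal sketch of that offset idea (this is not the repository's actual `RandomWordMaskingGenerator`; the function name, shapes, and sampling below are assumptions for illustration): the mask spans `len(label) + 1` positions because the first slot is reserved for the separator token, so the sampled word indices are shifted by 1 to skip it.

```python
import numpy as np

def random_word_mask(label_len, num_mask, rng=np.random):
    # Position 0 is reserved for the separator between visual and textual tokens,
    # hence the mask has label_len + 1 slots (mirrors len(label) + 1).
    mask = np.zeros(label_len + 1, dtype=bool)
    mask_idc = rng.choice(label_len, num_mask, replace=False)
    mask_idc = mask_idc + 1   # shift past the separator slot
    mask[mask_idc] = True
    return mask

print(random_word_mask(label_len=5, num_mask=2))
```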
@onealwj Hi, I'm confused about why `len(label)+1` is used instead of `len(label)` in the class `AlignCollate` when generating the random word mask. Why does `mask_idc` need to be increased by 1 (`mask_idc = mask_idc + 1`) in the `RandomWordMaskingGenerator`? Could you please give some advice?