cshizhe / hgr_v2t

Code accompanying the paper "Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning".
MIT License

A question about mpdata.py and rolesgraph.py in reader folder #19

Open sherlockfeng1995 opened 3 years ago

sherlockfeng1995 commented 3 years ago

I have a question about this data-loading function: why does it fetch only one caption per video?

    def __getitem__(self, idx):
        out = {}
        if self.is_train:
            video_idx, cap_idx = self.pair_idxs[idx]
            video_name = self.video_names[video_idx]
            mp_feature = self.mp_features[video_idx]
            sent = self.captions[cap_idx]
            cap_ids, cap_len = self.process_sent(sent, self.max_words_embedding)
            out['captions_ids'] = cap_ids
            out['captions_lens'] = cap_len
        else:
            video_name = self.video_names[idx]
            mp_feature = self.mp_features[idx]

        out['names'] = video_name
        out['mp_fts'] = mp_feature

        return out
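
To make the question concrete, here is a minimal sketch of one way the training branch could still cover every caption. It assumes `pair_idxs` holds one `(video_idx, cap_idx)` entry per video-caption pair, which the snippet above suggests but does not show. `ToyPairDataset` and its sample data are hypothetical and are not taken from the repository.

```python
class ToyPairDataset:
    """Hypothetical stand-in for the dataset in the question, not the repo's class."""

    def __init__(self, video_names, captions_per_video):
        self.video_names = video_names
        self.captions = []   # flat list of all captions
        self.pair_idxs = []  # assumed: one (video_idx, cap_idx) entry per caption
        for video_idx, sents in enumerate(captions_per_video):
            for sent in sents:
                self.pair_idxs.append((video_idx, len(self.captions)))
                self.captions.append(sent)

    def __len__(self):
        # The dataset length is the number of pairs, not the number of videos.
        return len(self.pair_idxs)

    def __getitem__(self, idx):
        # Each index selects one pair, so each item carries exactly one caption.
        video_idx, cap_idx = self.pair_idxs[idx]
        return {'names': self.video_names[video_idx],
                'caption': self.captions[cap_idx]}


dataset = ToyPairDataset(
    ['video0', 'video1'],
    [['a man sings', 'a person performs a song'], ['a dog runs']],
)
samples = [dataset[i] for i in range(len(dataset))]
```

Under that assumption, a video with several captions appears once per caption over an epoch, so each item holds a single caption while the full set is still seen during training.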