Open sherlockfeng1995 opened 3 years ago
I have a question about this data-loading function: why does it return only one caption per video?

    def __getitem__(self, idx):
        out = {}
        if self.is_train:
            video_idx, cap_idx = self.pair_idxs[idx]
            video_name = self.video_names[video_idx]
            mp_feature = self.mp_features[video_idx]
            sent = self.captions[cap_idx]
            cap_ids, cap_len = self.process_sent(sent, self.max_words_embedding)
            out['captions_ids'] = cap_ids
            out['captions_lens'] = cap_len
        else:
            video_name = self.video_names[idx]
            mp_feature = self.mp_features[idx]
        out['names'] = video_name
        out['mp_fts'] = mp_feature
        return out
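
One possible reading, offered only as a sketch: the snippet does not show how `pair_idxs` is built, but if it holds one `(video_idx, cap_idx)` entry for every caption, then each call returns a single caption while a full pass over the dataset still visits all captions of every video. The class `PairDataset` and its constructor arguments below are hypothetical and are not taken from the repository.

```python
class PairDataset:
    """Hypothetical stand-in for the training branch of the dataset in the question."""

    def __init__(self, video_names, captions_per_video):
        self.video_names = video_names
        self.captions = []   # all captions, flattened into one list
        self.pair_idxs = []  # assumed layout: one (video_idx, cap_idx) per caption
        for video_idx, caps in enumerate(captions_per_video):
            for cap in caps:
                self.pair_idxs.append((video_idx, len(self.captions)))
                self.captions.append(cap)

    def __len__(self):
        # The dataset length is the number of pairs, not the number of videos.
        return len(self.pair_idxs)

    def __getitem__(self, idx):
        # Each index yields one (video, caption) pair.
        video_idx, cap_idx = self.pair_idxs[idx]
        return {
            "names": self.video_names[video_idx],
            "caption": self.captions[cap_idx],
        }


dataset = PairDataset(
    ["v0", "v1"],
    [["a cat runs", "a cat jumps"], ["a dog barks"]],
)
samples = [dataset[i] for i in range(len(dataset))]
```

Under that assumption, a video with several captions simply appears several times per epoch, once for each caption. Whether the repository actually builds `pair_idxs` this way would need to be checked in the code that constructs it.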