ikuinen / CMIN_moment_retrieval

Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos

Question about bilinear fusion and loss calculation in the code #17

Open SCZwangxiao opened 3 years ago

SCZwangxiao commented 3 years ago

In the code, the bilinear fusion and the loss function differ from those described in the paper. I'm not questioning reproducibility, since I've successfully reproduced the results. I'm just wondering how much difference these modifications make.

(CMIN/models_/gcn_final.py, lines 148-154)

# interactive
x1 = self.v2s(frames, x, node_mask)
frames1, x1 = self.cross_gate(frames, x1)
x = torch.cat([frames1, x1], -1)
# x = self.bilinear(frames1, x1, F.relu)
x = self.rnn(x, frames_len, self.max_num_frames)
x = F.dropout(x, self.dropout, self.training)
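
To make the two options in the snippet concrete, here is a minimal pure-Python sketch of concatenation versus a generic bilinear form. The repository's self.bilinear module is not shown in this thread, so the formula out[k] = relu(v^T W[k] s + b[k]), the helper names, and the toy numbers are all assumptions for illustration. It shows only the structural difference and says nothing about the effect on accuracy.

```python
def concat_fusion(v, s):
    """Place the two feature vectors side by side; no cross-modal products."""
    return v + s


def bilinear_fusion(v, s, weights, bias):
    """Assumed generic bilinear form: out[k] = relu(v^T W[k] s + bias[k])."""
    out = []
    for w_k, b_k in zip(weights, bias):
        total = b_k
        for i, v_i in enumerate(v):
            for j, s_j in enumerate(s):
                total += v_i * w_k[i][j] * s_j
        out.append(max(0.0, total))  # ReLU, mirroring F.relu in the commented-out call
    return out


# Toy stand-ins for one frame feature and one query feature (hypothetical values).
frame_feat = [1.0, 2.0]
query_feat = [3.0, -1.0]
weights = [
    [[1.0, 0.0], [0.0, 1.0]],  # W[0]: identity, gives the dot product
    [[0.0, 1.0], [1.0, 0.0]],  # W[1]: pairs each dimension with the other one
]
bias = [0.0, 0.0]

concat_out = concat_fusion(frame_feat, query_feat)
bilinear_out = bilinear_fusion(frame_feat, query_feat, weights, bias)
print(concat_out)    # [1.0, 2.0, 3.0, -1.0]
print(bilinear_out)  # [1.0, 5.0]
```

With concatenation, any interaction between the two modalities has to be learned by the layers that follow (here the RNN). The bilinear form computes pairwise products explicitly, at the cost of one weight matrix per output unit.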
Starboy-at-earth commented 3 years ago

Dear SCZwangxiao:

Could you please tell me how to process the split C3D features of the Activity Caption dataset? They come in five parts, which is odd, and I cannot figure out how to process files with the .part-XX suffix.

SCZwangxiao commented 3 years ago

Go to the dataset directory that contains the *.part-XX files and run:

cat activitynet_v1-3.part-* > temp.zip && unzip temp.zip
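
If cat and unzip are not available (for example on Windows), the same two steps can be sketched with the Python standard library. The function name join_and_extract and its parameters are hypothetical. The sketch assumes the part suffixes sort correctly as plain strings, which is the same order the shell glob gives to cat.

```python
import glob
import shutil
import zipfile


def join_and_extract(pattern, joined_path, out_dir):
    """Concatenate the files matching `pattern` in sorted order, then unzip the result."""
    parts = sorted(glob.glob(pattern))
    if not parts:
        raise FileNotFoundError(f"no files match {pattern!r}")
    # Step 1: equivalent of `cat parts... > joined_path`.
    with open(joined_path, "wb") as joined:
        for part in parts:
            with open(part, "rb") as chunk:
                shutil.copyfileobj(chunk, joined)
    # Step 2: equivalent of `unzip joined_path`, extracting into out_dir.
    with zipfile.ZipFile(joined_path) as archive:
        archive.extractall(out_dir)
    return parts


# Example call, run from the dataset directory:
# join_and_extract("activitynet_v1-3.part-*", "temp.zip", ".")
```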
Starboy-at-earth commented 3 years ago

Thank you very much!!!