alipay / Ant-Multi-Modal-Framework

Research Code for Multimodal-Cognition Team in Ant Group
Creative Commons Attribution 4.0 International

Regarding reproduction of "Dual-Modal Attention-Enhanced Text-Video Retrieval with Triplet Partial Margin Contrastive Learning" #7

Closed: liangchild closed this issue 1 month ago

liangchild commented 3 months ago

Hello, I would like to inquire about an issue related to replicating the study "Dual-Modal Attention-Enhanced Text-Video Retrieval with Triplet Partial Margin Contrastive Learning". The paper mentions that both the text and video encoders utilize CLIP, yet in the code you provided, the base.yml configuration file specifies the text encoder as BERT and does not disclose what is used for the video encoder. Could you provide the configuration file for CLIP that you used?

echojiang0830 commented 3 months ago

Thank you very much for following our work. We apologize that, owing to company regulations, the reproduction did not succeed. Two points need clarifying. First, to allow a fair comparison with other methods (for example, by minimizing the effect of the code framework and runtime environment on the results), the results in our paper were obtained with an implementation based on the ts2net code framework (https://github.com/yuqi657/ts2_net). Second, company regulations require that the code for our paper be open-sourced on top of the current Antmmf framework. For these reasons, the existing scripts cannot fully reproduce the results in our paper. To reproduce them, the method needs to be migrated to the ts2net code framework.