Two branch or two loss - Githubissues

HuazhangHu commented 1 year ago

In the paper, you mention that "To reduce conflict between the two branches, the query-video branch is trained first, followed by the query-caption branch". However, you also state that "The total loss L is the sum of Query-Video loss L_{QV} and Query-Caption loss L_{QC}". Are the two branches trained separately? If so, which loss is used when the query-video branch is trained first, and which loss is used when the query-caption branch is trained afterwards? Also, how many epochs does it take to train the query-video branch in the first stage?

whwu95 commented 1 year ago

In our paper, the query-video branch and the query-caption branch are trained separately. We first train the query-video branch for 5 epochs. Once this branch is trained, we proceed to train the query-caption branch.
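
For readers trying to picture the schedule described above, here is a minimal sketch of a two-stage plan in plain Python. It is not taken from the Cap4Video code: the function names `stage_schedule` and `active_loss` are hypothetical, and the length of the second stage (`qc_epochs`) is an assumption, since the reply only states the 5 epochs of the first stage.

```python
from typing import List, Tuple


def stage_schedule(qv_epochs: int = 5, qc_epochs: int = 5) -> List[Tuple[int, str, str]]:
    """Return (epoch, branch, loss name) for a two-stage schedule.

    qv_epochs=5 follows the reply above; qc_epochs is an assumed
    placeholder because the thread does not give that number.
    """
    schedule = []
    # Stage 1: only the query-video branch is trained, using L_QV.
    for epoch in range(qv_epochs):
        schedule.append((epoch, "query-video", "L_QV"))
    # Stage 2: the query-caption branch is trained next, using L_QC.
    for epoch in range(qv_epochs, qv_epochs + qc_epochs):
        schedule.append((epoch, "query-caption", "L_QC"))
    return schedule


def active_loss(l_qv: float, l_qc: float, branch: str) -> float:
    """Pick the loss that is optimized for the branch being trained."""
    if branch == "query-video":
        return l_qv
    if branch == "query-caption":
        return l_qc
    raise ValueError(f"unknown branch: {branch}")


if __name__ == "__main__":
    for epoch, branch, loss_name in stage_schedule(qv_epochs=5, qc_epochs=2):
        print(f"epoch {epoch}: train {branch} branch with {loss_name}")
```

Under this reading, only one loss term is optimized in each stage, and the sum L = L_{QV} + L_{QC} quoted from the paper would describe the two objectives taken together rather than a single joint update. Whether the released training scripts follow exactly this split is not confirmed in this thread.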

shams2023 commented 9 months ago

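"In our paper, the query-video branch and the query-caption branch are trained separately. We first train the query-video branch for 5 epochs. Once this branch is trained, we proceed to train the query-caption branch."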

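I looked at your code and found that train_video.py already uses the captions. In that case, how should I understand your statement that the first 5 epochs train the query-video branch? (As I understand it, if the first 5 epochs are meant to train the query-video branch, captions should not appear at all, because if captions are present, the query encoder also processes caption information, and then there is no such thing as a first five epochs that train only the query-video branch.) I am not sure whether my understanding is correct. I am quite confused about this part and hope to get your reply.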

whwu95 / Cap4Video

Two branch or two loss #4