Open huangmengxiao000 opened 1 year ago
Is G1+G2 trained by loading the weights of G1 after G1 finishes training?
"The prediction label of the clip is aggregated for the prediction of the video." How should this sentence be understood — what is the aggregation method? Is it averaging?
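For illustration, here is a minimal sketch of one common clip-to-video aggregation strategy — simple averaging, as the question guesses. The thread does not confirm which method this repo actually uses, and the function name and threshold below are hypothetical:

```python
# Hypothetical sketch: aggregate clip-level fake probabilities into one
# video-level prediction by averaging (one common choice; not confirmed
# by the maintainers in this thread).

def aggregate_video_prediction(clip_probs):
    """Average the per-clip probabilities and threshold at 0.5."""
    video_prob = sum(clip_probs) / len(clip_probs)
    return video_prob, int(video_prob >= 0.5)

# e.g. three clips sampled from one video
prob, label = aggregate_video_prediction([0.9, 0.8, 0.4])
# prob is the mean probability (~0.7), label is the binary video decision
```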
All hyperparameters are recorded in the config files under ./training/config/. For the dataset split, see the documentation under ./training/datasets/; it has been updated somewhat relative to the paper version. See also lines 133–147 of ./training/tain.py.

If I choose the branch `all`, how should training be set up?
So if you choose `all`, you first train g1 and save its weights, then train g2 and save its weights, and the two are evaluated together at evaluation time?
Does "epoch" in the paper mean one complete epoch of training, i.e., one full pass over all batches?
> So if you choose `all`, you first train g1 and save its weights, then train g2 and save its weights, and the two are evaluated together at evaluation time?
Yes. In the training and test code you can freely switch between training or testing only one of the branches. If both are selected, g1 is trained and its weights are saved, and then g2 is trained; the two do not interfere with each other.
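The two-stage workflow described above can be sketched as follows. Note that `train_branch` and the checkpoint names are hypothetical stand-ins, not the repo's actual API — this only illustrates the order of operations when the branch is `all`:

```python
# Sketch of the described workflow: with branch == "all", g1 is trained
# and saved first, then g2; the two checkpoints are evaluated together
# afterwards. All names here are illustrative placeholders.

def train_branch(name):
    # placeholder for training one branch and saving its weights
    return {"branch": name, "weights": f"{name}.ckpt"}

def run(branch="all"):
    names = ("g1", "g2") if branch == "all" else (branch,)
    checkpoints = []
    for name in names:
        checkpoints.append(train_branch(name))  # branches do not interfere
    return checkpoints  # both checkpoints are then evaluated jointly

ckpts = run("all")  # trains g1 first, then g2
```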
> Does "epoch" in the paper mean one complete epoch of training, i.e., one full pass over all batches?
Yes.
Is the g1+g2 result in the code the g1+g2 result reported in the paper?
Where is the k = 64 mentioned in the paper set? The paper says: "In the classification procedure, each RNN in our two-stream network is bidirectional and consists of GRUs (Gated Recurrent Units) whose number of output units is set to k = 64." Are the model parameters used in PyTorch the ones below, and what does each of them mean?

- feature_size: 136
- lm_dropout_rate: 0.1
- rnn_unit: 32
- num_layers: 1
- rnn_dropout_rate: 0
- fc_dropout_rate: 0.5
- res_hidden: 64
> Is the g1+g2 result in the code the g1+g2 result reported in the paper?
Yes.
> Where is the k = 64 mentioned in the paper set? Are the model parameters used in PyTorch the ones below, and what does each of them mean?
Just read the config alongside the code — the meaning of each parameter should be easy to infer from its variable name.
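One plausible reading of how the config relates to the paper's k = 64 (an assumption, not confirmed in this thread): if `rnn_unit: 32` is the per-direction hidden size of a *bidirectional* GRU, then each time step emits features from both directions, giving 2 × 32 = 64 output units. The `bidirectional` flag and the landmark interpretation of `feature_size` below are guesses for illustration:

```python
# Assumed mapping from the config to the paper's k = 64: with a
# bidirectional GRU, output units per step = 2 * hidden size per direction.

config = {
    "feature_size": 136,    # input dim (possibly 68 landmarks x 2 coords)
    "rnn_unit": 32,         # hidden size per direction (assumption)
    "num_layers": 1,
    "bidirectional": True,  # assumed from the paper's description
}

directions = 2 if config["bidirectional"] else 1
output_units = directions * config["rnn_unit"]  # 2 * 32 = 64, matching k
```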