frederickszk / LRNet

Landmark Recurrent Network: An efficient and robust framework for Deepfakes detection
MIT License

Training #25

Open huangmengxiao000 opened 1 year ago

huangmengxiao000 commented 1 year ago
  1. Hello, may I ask, are all the settings the ones mentioned in the paper? "In the preprocessing step, we adopt Dlib [14] to carry out face and landmark detection (another detector, OpenFace [4], is adopted in the ablation study). In the classification procedure, each RNN in our two-stream network is bidirectional and consists of GRUs (Gated Recurrent Units) whose number of output units is set to k = 64. Two fully-connected layers with 64 and 2 units are connected to the RNN layer's output. A dropout layer with drop rate dr1 = 0.25 is inserted between the input and the RNN, and another 3 dropout layers with dr2 = 0.5 separate the remaining layers. These settings are partially based on existing research results [26]. In addition, we adopt an 8:2 dataset split, i.e., 80% for training and 20% for testing. Each video is segmented into clips with a fixed length of 60 frames, which sums to 2 seconds at 30 fps. We adopt the Adam optimizer with lr = 0.001, and the batch size is set to 1024. The classification model is trained for up to 500 epochs."
  2. Regarding the epoch question: the epoch parameter is passed to the model, but is it ultimately used per iteration or per epoch? And why does val_acc fluctuate so much?
  3. Does G1+G2 mean training G1 first and then G2, i.e., loading G1's weights after it is trained and then continuing to train G2, or how is this part set up?
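The settings quoted in item 1 can be sketched as a single RNN stream in PyTorch. This is an illustrative sketch, not the repo's actual code: the class name, the use of the last time step, the ReLU activation, and the exact placement of the dr2 dropout layers are assumptions (the quoted text does not fully pin them down).

```python
import torch
import torch.nn as nn

class ClipClassifier(nn.Module):
    """Sketch of one RNN stream from the quoted settings: a bidirectional
    GRU with k = 64 output units (32 per direction), FC layers of 64 and
    2 units, dropout dr1 = 0.25 before the RNN and dr2 = 0.5 between the
    later layers."""
    def __init__(self, feature_size=136, k=64):
        super().__init__()
        self.drop_in = nn.Dropout(0.25)                 # dr1
        self.rnn = nn.GRU(feature_size, k // 2,
                          batch_first=True, bidirectional=True)
        self.drop_rnn = nn.Dropout(0.5)                 # dr2
        self.fc1 = nn.Linear(k, 64)
        self.drop_fc = nn.Dropout(0.5)                  # dr2
        self.fc2 = nn.Linear(64, 2)                     # real / fake

    def forward(self, x):        # x: (batch, clip_len=60, feature_size)
        out, _ = self.rnn(self.drop_in(x))
        out = out[:, -1, :]      # use the last time step (an assumption)
        out = self.drop_fc(torch.relu(self.fc1(self.drop_rnn(out))))
        return self.fc2(out)

model = ClipClassifier()
logits = model(torch.zeros(4, 60, 136))   # 4 clips of 60 frames each
print(logits.shape)                       # torch.Size([4, 2])
```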
huangmengxiao000 commented 1 year ago

Is G1+G2 trained by loading the weights of G1 after G1 has been trained?

huangmengxiao000 commented 1 year ago

"The prediction label of the clip is aggregated for the prediction of the video." How should this sentence be understood? What is the aggregation method? Is it averaging?

frederickszk commented 1 year ago
  1. The paper version is somewhat outdated. For the parameters used in the latest update, see the config files under ./training/config/, where all hyperparameters are now recorded in one place. The dataset split is also documented in ./training/datasets/, which has some updates compared with the paper version.
  2. g1 and g2 are trained separately; they are two different networks. For details, see lines 133-147 of ./training/tain.py.
  3. As for the final aggregation: the outputs of g1 and g2 are averaged to obtain the prediction for each clip, and then the results of all clips are summed up to obtain the prediction probability for the whole video. For the implementation, see: https://github.com/frederickszk/LRNet/blob/408f28881e2864e8d731143849abf56f961ddefa/training/evaluate.py#L106 and the code that follows.
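The aggregation in item 3 can be sketched in plain Python (the function and variable names here are hypothetical; the actual implementation is in the linked evaluate.py): average the g1 and g2 probabilities for each clip, then combine the clip scores, which after normalizing by the clip count is the same as averaging them, into one video-level probability.

```python
def aggregate_video_score(g1_clip_probs, g2_clip_probs):
    """g*_clip_probs: per-clip fake-probabilities from each stream.
    Returns a single probability for the whole video."""
    # 1) average the two streams' outputs for each clip
    clip_scores = [(p1 + p2) / 2
                   for p1, p2 in zip(g1_clip_probs, g2_clip_probs)]
    # 2) combine the clip scores (summing then normalizing = averaging)
    return sum(clip_scores) / len(clip_scores)

print(aggregate_video_score([0.8, 0.6], [0.6, 0.4]))   # 0.6
```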
huangmengxiao000 commented 1 year ago

If I choose the `all` branch, how should training be set up?

huangmengxiao000 commented 1 year ago

If you choose `all`, do you first train g1 and save its weights, then train g2 and save its weights, and then evaluate the two together at evaluation time?

huangmengxiao000 commented 1 year ago

Does the "epoch" in the paper mean one full training epoch, i.e., one complete pass over all batches?

frederickszk commented 1 year ago

> If you choose `all`, do you first train g1 and save its weights, then train g2 and save its weights, and then evaluate the two together at evaluation time?

Yes. In the training and testing code you can freely switch between training or testing only one of the branches. If both are selected, g1 is trained first and its weights saved, then g2 is trained. The two do not interfere with each other.

> Does the "epoch" in the paper mean one full training epoch, i.e., one complete pass over all batches?

Yes.
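The `all` workflow described above amounts to two fully independent training runs with separate checkpoints. A minimal sketch (the checkpoint paths and the `nn.Linear` stand-ins are hypothetical, not the repo's actual code):

```python
import torch
import torch.nn as nn

# Stand-ins for the real g1/g2 networks: the point is only that they
# are separate modules with separate training loops and checkpoints.
g1 = nn.Linear(136, 2)
g2 = nn.Linear(136, 2)

# ... run g1's own training loop here, then checkpoint it ...
torch.save(g1.state_dict(), "g1.pth")

# ... run g2's own training loop afterwards; g1 is not involved ...
torch.save(g2.state_dict(), "g2.pth")

# At evaluation time, both checkpoints are loaded and their
# per-clip outputs combined as described above.
g1.load_state_dict(torch.load("g1.pth"))
g2.load_state_dict(torch.load("g2.pth"))
```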

huangmengxiao000 commented 1 year ago

Is the g1+g2 result in the code the g1+g2 result reported in the paper?

huangmengxiao000 commented 1 year ago

Where is the k = 64 mentioned in the paper set? "In classification procedure, each RNN in our two-stream network is bidirectional and consists of GRU (Gated Recurrent Unit) whose number of output units is set to be k = 64." Are the model parameters used in PyTorch the ones below, and what do they mean?

feature_size: 136
lm_dropout_rate: 0.1
rnn_unit: 32
num_layers: 1
rnn_dropout_rate: 0
fc_dropout_rate: 0.5
res_hidden: 64

frederickszk commented 1 year ago

> Is the g1+g2 result in the code the g1+g2 result reported in the paper?

Yes.

> Where is the k = 64 mentioned in the paper set? Are the model parameters used in PyTorch the ones below (feature_size: 136, lm_dropout_rate: 0.1, rnn_unit: 32, num_layers: 1, rnn_dropout_rate: 0, fc_dropout_rate: 0.5, res_hidden: 64), and what do they mean?

Just read it alongside the code ~ the variable names should make their meanings easy to infer.
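One way to see how `rnn_unit: 32` relates to the paper's k = 64: in a bidirectional GRU, the forward and backward directions each contribute `rnn_unit` features per time step, so the output width is 2 × 32 = 64. A minimal PyTorch check (illustrative only, not the repo's model code; feature_size 136 presumably corresponds to 68 landmarks × 2 coordinates):

```python
import torch
import torch.nn as nn

# feature_size = 136, rnn_unit = 32, num_layers = 1 (from the config above)
rnn = nn.GRU(input_size=136, hidden_size=32, num_layers=1,
             batch_first=True, bidirectional=True)

x = torch.zeros(1, 60, 136)   # one clip: 60 frames of 136-d features
out, _ = rnn(x)
print(out.shape)              # torch.Size([1, 60, 64]) -> k = 64
```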