Closed Hpjhpjhs closed 1 year ago
No downsampling is used for pre-training any of the networks in this work. Downsampling is only used during the fine-tuning process.
@Martlgap Thank you for the reply. So the comparison in Table 1 of this paper is between the pretrained model and the pretrained model fine-tuned with data downsampling and the octuplet loss. Did you measure the benefit of the data downsampling on its own?
@Hpjhpjhs Yes, you are right. The octuplet loss requires data downsampling.
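For readers unfamiliar with the setup, the downsampling step can be sketched roughly as below. This is an illustrative PyTorch snippet, not code from the repository; the target low resolution (`low_res=16`) and the bilinear down/up-sampling choice are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def make_low_res(imgs, low_res=16):
    """Create a low-resolution version of a face batch.

    Bilinearly downsample to low_res x low_res, then upsample back to the
    original input size so the network still receives its expected shape.
    low_res=16 is an illustrative value, not taken from the paper.
    """
    size = imgs.shape[-2:]
    small = F.interpolate(imgs, size=(low_res, low_res),
                          mode="bilinear", align_corners=False)
    return F.interpolate(small, size=size,
                         mode="bilinear", align_corners=False)

batch = torch.randn(2, 3, 112, 112)   # dummy aligned face crops
lr_batch = make_low_res(batch)        # same shape, degraded detail
```

The high-resolution and low-resolution versions of each identity then supply the contrasting terms the loss is computed over.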
The BT-M model of this paper (https://arxiv.org/pdf/2107.03769.pdf) is probably what you mean by measuring the benefit of data downsampling alone. There, the benefit was measured for ArcFace by training simply with CE loss but on downsampled data.
@Martlgap Thanks for the reference. But I have a practical question: do you remove the final partial FC and use the backbone's output features as the input to the triplet-loss computation?
@Hpjhpjhs We extract the features for our loss from the bottleneck layer, which in our case is a 512-d FC layer right after the backbone. The last FC, which is used for pre-training with cross-entropy, is removed for our fine-tuning.
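To make the layer layout concrete, here is a minimal sketch of that arrangement. This is assumed PyTorch with a stand-in backbone; the class and layer names are illustrative, not taken from the repository.

```python
import torch
import torch.nn as nn

class FaceModel(nn.Module):
    """Backbone -> 512-d bottleneck FC -> (classifier used only for CE pre-training)."""

    def __init__(self, feat_dim=512, num_ids=1000):
        super().__init__()
        # Stand-in backbone; a real model would be e.g. a ResNet trunk.
        self.backbone = nn.Sequential(
            nn.Flatten(), nn.Linear(112 * 112 * 3, 256), nn.ReLU()
        )
        # Bottleneck: 512-d FC right after the backbone; its output
        # is the embedding fed to the triplet/octuplet loss.
        self.bottleneck = nn.Linear(256, feat_dim)
        # Final FC for cross-entropy pre-training; dropped for fine-tuning.
        self.classifier = nn.Linear(feat_dim, num_ids)

    def forward(self, x, return_embedding=False):
        emb = self.bottleneck(self.backbone(x))
        if return_embedding:   # fine-tuning path: classifier is bypassed
            return emb
        return self.classifier(emb)  # pre-training path: CE logits

model = FaceModel()
imgs = torch.randn(4, 3, 112, 112)
emb = model(imgs, return_embedding=True)  # 4 x 512 embeddings for the loss
```

During fine-tuning only the embedding path is used, so the classification head can simply be deleted or left unused.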
@Hpjhpjhs, does this answer your question?
Thanks for your work! But I'm still puzzled about the data preprocessing for the pretrained face model. Apart from the fine-tuning process, is data downsampling also used to pretrain the face model?