Open liu66-xf opened 7 months ago
Sorry for the late reply. In our implementation, we pretrain the network on frames sampled from UCF101 and CK+. We treat UCF101 as the 'temporal' prior because the dataset mainly contains motion information from videos rather than facial information. We treat CK+ as the 'spatial' prior because its sequences are short in temporal length but rich in facial emotion information. We will release the network weights in our repository later. Thank you for your interest in our work.
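Professor Pan, could I ask how this sentence in the paper should be understood: "The initial weights of the STA-DRN are pretrained parameters on UCF101 for the temporal prior of videos, and on CK+ for the spatial prior of facial images."? Also, would it be possible to share the pretrained weight files for reference?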