Open tangyc314 opened 1 year ago
"audio": { "tool": "opensmile", "sample_rate": 16000, "args": { "feature_set": "eGeMAPSv02", "feature_level": "LowLevelDescriptors", "start": null, "end": null } }, "video": { "tool": "openface", "fps": 25, "multiFace": { "enable": false, "device": "cuda:0", "facedetScale": 0.25, "minTrack": 10, "numFailedDet": 10, "minFaceSize": 1, "cropScale": 0.4 }, "average_over": 1, "args": { "hogalign": false, "simalign": false, "nobadaligned": false, "landmark_2D": true, "landmark_3D": false, "pdmparams": false, "head_pose": false, "action_units": true, "gaze": false, "tracked": false } 以以上参数提取特征,获得的vision特征向量范围不是-1到1,而是会包含一些很大的数据如5.18e+02,是有哪里需要调整吗
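Judging from the final feature dimensionality, "head_pose" should be true, but I would like to know how the features were normalized. When I simply normalized each row, the model trained far worse than it does with the features provided with the simsv2 dataset.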
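For anyone comparing normalization choices, here is a minimal sketch of per-feature (column-wise) standardization as an alternative to normalizing each row. This is not confirmed to be what the dataset authors did; it is only an assumption about one reasonable approach. OpenFace reports 2D landmarks in pixel coordinates, which probably explains values like 5.18e+02, while action-unit intensities lie on a much smaller scale, so normalizing across a row mixes the two. The function name `standardize_per_feature` and the synthetic data are hypothetical.

```python
import numpy as np

def standardize_per_feature(features, eps=1e-8):
    """Z-score each feature column of a (num_frames, num_features) array.

    Hypothetical helper, not part of any extraction tool's API.
    """
    features = np.asarray(features, dtype=np.float64)
    mean = features.mean(axis=0, keepdims=True)  # one mean per feature column
    std = features.std(axis=0, keepdims=True)    # one std per feature column
    return (features - mean) / (std + eps)       # eps guards constant columns

# Synthetic stand-in for extracted features (assumption, not real data):
# pixel-scale landmark columns next to small-scale action-unit columns.
rng = np.random.default_rng(0)
landmarks = rng.uniform(0.0, 640.0, size=(50, 4))
action_units = rng.uniform(0.0, 5.0, size=(50, 3))
frames = np.hstack([landmarks, action_units])

normalized = standardize_per_feature(frames)
```

In practice the mean and standard deviation would usually be computed on the training split only and then reused for validation and test data.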
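Did you manage to solve this? In the paper provided by the authors, the visual feature extraction first uses talknet and then openface. How is that implemented?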
"audio": { "tool": "opensmile", "sample_rate": 16000, "args": { "feature_set": "eGeMAPSv02", "feature_level": "LowLevelDescriptors", "start": null, "end": null } }, "video": { "tool": "openface", "fps": 25, "multiFace": { "enable": false, "device": "cuda:0", "facedetScale": 0.25, "minTrack": 10, "numFailedDet": 10, "minFaceSize": 1, "cropScale": 0.4 }, "average_over": 1, "args": { "hogalign": false, "simalign": false, "nobadaligned": false, "landmark_2D": true, "landmark_3D": false, "pdmparams": false, "head_pose": false, "action_units": true, "gaze": false, "tracked": false } 以以上参数提取特征,获得的vision特征向量范围不是-1到1,而是会包含一些很大的数据如5.18e+02,是有哪里需要调整吗