thuiar / MMSA-FET

A Tool for extracting multimodal features from videos.
GNU General Public License v3.0
131 stars 20 forks

Requesting the feature extraction config for the SIMSv2 dataset #31

Open tangyc314 opened 10 months ago

tangyc314 commented 10 months ago

"audio": {
  "tool": "opensmile",
  "sample_rate": 16000,
  "args": {
    "feature_set": "eGeMAPSv02",
    "feature_level": "LowLevelDescriptors",
    "start": null,
    "end": null
  }
},
"video": {
  "tool": "openface",
  "fps": 25,
  "multiFace": {
    "enable": false,
    "device": "cuda:0",
    "facedetScale": 0.25,
    "minTrack": 10,
    "numFailedDet": 10,
    "minFaceSize": 1,
    "cropScale": 0.4
  },
  "average_over": 1,
  "args": {
    "hogalign": false,
    "simalign": false,
    "nobadaligned": false,
    "landmark_2D": true,
    "landmark_3D": false,
    "pdmparams": false,
    "head_pose": false,
    "action_units": true,
    "gaze": false,
    "tracked": false
  }
}

When I extract features with the parameters above, the resulting vision feature vectors are not in the range -1 to 1. Instead, they contain some very large values such as 5.18e+02. Is there something I need to adjust?
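One way to narrow this down is to look at the value range of each feature column separately, since a few columns on a different scale would explain isolated large values (pixel-valued landmark coordinates are one possible cause, but that is an assumption and not confirmed in this thread). A minimal sketch, assuming the vision features are available as a frames × features list of rows; `column_ranges`, `columns_outside` and the toy data are hypothetical and not part of MMSA-FET:

```python
def column_ranges(features):
    """Return a (min, max) pair for every feature column."""
    return [(min(col), max(col)) for col in zip(*features)]


def columns_outside(features, low=-1.0, high=1.0):
    """Return indices of columns with at least one value outside [low, high]."""
    return [
        i
        for i, (col_min, col_max) in enumerate(column_ranges(features))
        if col_min < low or col_max > high
    ]


# Toy data (made up): 3 frames, 3 features.
# Column 1 mimics a value on a much larger scale than the others.
toy = [
    [0.10, 518.0, -0.30],
    [0.25, 520.5, 0.00],
    [0.05, 517.2, 0.40],
]

print(columns_outside(toy))
```

If only a contiguous group of columns falls outside the range, that points to one feature group being on a different scale rather than to a bug in the extraction.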

tangyc314 commented 9 months ago

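Judging from the final feature dimension, "head_pose" should be true. However, I would like to know how the features were normalized. After I normalized them directly by row, the model training results were far worse than those obtained with the features provided with the SIMSv2 dataset.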
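The thread does not say how the released SIMSv2 features were normalized, so the following is only a sketch of why row-wise normalization can hurt: when one column is on a much larger scale, normalizing each row (frame) lets that column dominate, and the small-scale features collapse to nearly identical values. Normalizing each column (feature) over all frames keeps the variation within every feature. The function names and toy data are hypothetical, and z-score scaling is an assumed choice, not the dataset authors' confirmed method:

```python
import math


def _mean_std(values):
    """Population mean and standard deviation of a sequence."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)


def zscore_rows(features, eps=1e-8):
    """Normalize each row (frame) across its own feature values."""
    out = []
    for row in features:
        mean, std = _mean_std(row)
        out.append([(v - mean) / (std + eps) for v in row])
    return out


def zscore_columns(features, eps=1e-8):
    """Normalize each column (feature) across all frames."""
    stats = [_mean_std(col) for col in zip(*features)]
    return [
        [(v - mean) / (std + eps) for v, (mean, std) in zip(row, stats)]
        for row in features
    ]


# Toy data (made up): column 1 is on a much larger scale than columns 0 and 2.
toy = [
    [0.10, 518.0, -0.30],
    [0.25, 520.5, 0.00],
    [0.05, 517.2, 0.40],
]

by_row = zscore_rows(toy)
by_col = zscore_columns(toy)

# Row-wise: columns 0 and 2 become almost indistinguishable within a frame.
print([round(v, 3) for v in by_row[0]])
# Column-wise: each feature keeps its own variation over time.
print([round(v, 3) for v in by_col[0]])
```

In a real training setup the per-column statistics would normally be computed on the training split only and then reused for the validation and test splits.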