Thank you for your research. I am trying to reproduce your results. I cut the DAIC-WOZ database according to your standards, extracted the wav2vec features, and trained with the meta-feature file you provided, but the final F1 was only 0.603. Where do you think the problem might be? Also, when preprocessing the data, did you remove the virtual agent's voice and keep only the subject's voice?