my-yy / sl_icmr2022

Code for "Self-Lifting: A Novel Framework for Unsupervised Voice-Face Association Learning", ICMR 2022

How were the vector-formed features extracted? #1

Closed sos1sos2Sixteen closed 1 year ago

sos1sos2Sixteen commented 1 year ago

I am wondering which specific ECAPA-TDNN and Inception models were used to produce the provided preprocessed dataset. Are they publicly available? And what preprocessing techniques were involved in extracting these features?

my-yy commented 1 year ago

Hello, I used https://github.com/timesler/facenet-pytorch for the Inception model. The ECAPA-TDNN is based on SpeechBrain. However, since SpeechBrain's public model is trained on vox1+vox2, I retrained it on vox2 only. I need some time to find the processing script and the checkpoint; if there are no accidents, I will update the repository this week.

sos1sos2Sixteen commented 1 year ago

> Hello, I used https://github.com/timesler/facenet-pytorch for the inception model. The ECAPA-TDNN is based on SpeechBrain. While the SpeechBrain's model is trained with vox1+vox2, thus I retrained it with vox2 only. I need some time to find the processing script and the checkpoint. If there are no accidents, I will update the repository this week.

Many thanks for your very helpful reply!