linhdvu14 / vggvox-speaker-identification

Speaker identification with VGGVox network

Where is cfg/enroll_list.csv? #3

Closed. zhengqun closed this issue 5 years ago

zhengqun commented 5 years ago

FileNotFoundError: File b'cfg/enroll_list.csv' does not exist

linhdvu14 commented 5 years ago

You need to create the cfg files. The format is:

filename,speaker
[path to wav 1],[true speaker 1]
[path to wav 2],[true speaker 2]

etc.
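
A minimal sketch of generating such a file with Python's standard csv module, assuming the header row is filename,speaker as described above. The helper name write_list_csv, the wav paths and the speaker labels are hypothetical placeholders, not part of the repository.

```python
import csv
import os


def write_list_csv(csv_path, rows):
    """Write (wav_path, speaker) pairs under a `filename,speaker` header."""
    parent = os.path.dirname(csv_path)
    if parent:
        # Create the target directory (e.g. cfg/) if it does not exist yet.
        os.makedirs(parent, exist_ok=True)
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["filename", "speaker"])
        writer.writerows(rows)


if __name__ == "__main__":
    # Hypothetical enrollment data: one row per wav file and its true speaker.
    enroll_rows = [
        ("data/wav/speaker_a_01.wav", "speaker_a"),
        ("data/wav/speaker_b_01.wav", "speaker_b"),
    ]
    write_list_csv("cfg/enroll_list.csv", enroll_rows)
```

The same helper could be reused for the test list by passing a different path and different rows.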

zhengqun commented 5 years ago

Did you train/test your NN model on the VoxCeleb1 dataset?

linhdvu14 commented 5 years ago

Model weights are ported from the pretrained model (see #1), which was trained on Vox1. You can run the evaluation script (scoring.py) on any data.

zhengqun commented 5 years ago

OK, thank you very much. Could you share your three CSV files (enroll, test and result)? I would like to see them. Thanks for your attention; I look forward to your reply. I'm a master's student at UCAS. Thank you.

SwapnilBorse123 commented 5 years ago

@linhdvu14 Hey, can you please share the three csv files as requested by @zhengqun? I am running into the same problem. I am trying to get a score for a wav file where Scarlett Johanson is the speaker, so my test_list.csv and enroll_list.csv contain the text below:

filename,speaker
./wav_files/test_sample.wav,Scarlett Johanson

But when I run this code, get_fft_spectrum fails on the line rsize = max(k for k in buckets if k <= fft_norm.shape[1]) with an error saying that max() received an empty sequence as its argument.

Am I doing anything wrong? Could you share a sample of the three working csv files? That would be a great help. Thanks a lot.
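
For context on the error above: max() raises ValueError when its input is empty, so no key in buckets satisfied k <= fft_norm.shape[1]. One plausible cause, assuming the keys of buckets are frame counts, is a clip with fewer frames than the smallest bucket. The sketch below reproduces the selection with a clearer error message; pick_bucket_size and example_buckets are hypothetical names and values, not part of the repository.

```python
def pick_bucket_size(buckets, n_frames):
    """Return the largest bucket key that is <= n_frames."""
    if not buckets:
        raise ValueError("bucket table is empty")
    candidates = [k for k in buckets if k <= n_frames]
    if not candidates:
        # This is the case where the bare max(...) would fail on an empty sequence.
        raise ValueError(
            f"clip has {n_frames} frames, fewer than the smallest bucket "
            f"({min(buckets)} frames); try a longer recording"
        )
    return max(candidates)


# Hypothetical bucket table keyed by frame count (values are placeholders).
example_buckets = {100: 2, 200: 5, 300: 8}
```

With this table, a clip of 250 frames would be truncated to 200 frames, while a clip of 50 frames would raise the descriptive error instead of the bare max() failure.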

zhengqun commented 5 years ago

Sorry, the author didn't share the three .csv files with me. But you can write them yourself by referring to the VoxCeleb dataset; there are four files in it. You can also refer to voxceleb.csv.

SwapnilBorse123 commented 5 years ago

@zhengqun thank you so much for the quick reply. Can you please share your three files? Also, do you have any idea how to make this work on a wav file with one of the celebrity voices from the VoxCeleb celebrities list? With reference to my earlier comment (https://github.com/linhdvu14/vggvox-speaker-identification/issues/3#issuecomment-442634946), my output looks like this:

test_file,test_speaker,Scarlett Johanson,result,correct
./wav_files/audio1.wav,Scarlett Johanson,4.7628567756419216e-14,Scarlett Johanson,1.0

zhengqun commented 5 years ago

Oh, I am so sorry, I was away on a business trip.

linhdvu14 commented 5 years ago

@SwapnilBorse123 @zhengqun please see latest commit.

zhengqun commented 5 years ago

Thank you very much.

SwapnilBorse123 commented 5 years ago

@linhdvu14 Thank you so much for your efforts. I could understand the file content from your code, and I made your code work for my project with some modifications to suit my needs. I am now able to annotate the superhero's name correctly when he/she is speaking in any Marvel movie. Thanks a million for your fantastic efforts. I will certainly credit your work in my project.