taylorlu / Speaker-Diarization

speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition
Apache License 2.0

Sliding for long audios #21

Open ozcelikkale opened 4 years ago

ozcelikkale commented 4 years ago

I tested the code with long audio files (20+ minutes) and noticed that the algorithm does not cover the whole audio file. For instance, for a 35-minute recording, the output covers only 34:23. Also, after a certain point, the speaker change points drift by about 2-3 seconds, and later by 10+ seconds. Do you have any suggestions?

ozcelikkale commented 4 years ago

I used overlap_rate = 0 and embedding_per_second = 0.3. By the way, if I select embedding_per_second = 1.2 and overlap_rate = 0.4, it almost works and covers the whole audio file.
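
To show how these two parameters could affect coverage, here is a minimal sketch. It assumes (this is not confirmed in the thread) that each embedding window is 1 / embedding_per_second seconds long, that consecutive windows start window * (1 - overlap_rate) seconds apart, and that a final partial window is dropped. The function name `window_coverage` is hypothetical and is not part of the repository.

```python
def window_coverage(duration_s, embedding_per_second, overlap_rate):
    """Return (num_windows, covered_seconds, dropped_tail_seconds).

    Assumption: only full windows are kept, so any audio after the
    end of the last full window is not embedded.
    """
    win = 1.0 / embedding_per_second      # assumed window length in seconds
    hop = win * (1.0 - overlap_rate)      # assumed step between window starts
    if duration_s < win:
        # Not even one full window fits.
        return 0, 0.0, duration_s
    num_windows = int((duration_s - win) // hop) + 1
    covered = (num_windows - 1) * hop + win
    return num_windows, covered, duration_s - covered


# 10.5 s of audio, 2 s windows, 50% overlap:
print(window_coverage(10.5, 0.5, 0.5))    # (9, 10.0, 0.5)
# 10 s of audio, 4 s windows, no overlap:
print(window_coverage(10.0, 0.25, 0.0))   # (2, 8.0, 2.0)
```

Under these assumptions the dropped tail is always shorter than one window, so a smaller window (higher embedding_per_second) loses less at the end. It would not by itself account for a 37-second gap, which suggests something else (for example, removed silence) also contributes.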

taylorlu commented 4 years ago

Perhaps the last silent (mute) slice was ignored.
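
One way to check this hypothesis is to measure how much audio lies after the last non-silent interval. The sketch below assumes the non-silent intervals are available as (start, end) sample indices, the format returned by, for example, `librosa.effects.split`. The helper name `trailing_silence` and the 16 kHz default sample rate are assumptions for illustration only.

```python
def trailing_silence(total_samples, voiced_intervals, sr=16000):
    """Seconds between the end of the last non-silent interval and the end of the file.

    voiced_intervals: iterable of (start, end) sample indices (assumed format).
    """
    intervals = list(voiced_intervals)
    if not intervals:
        # No speech detected: the whole file counts as silence.
        return total_samples / sr
    last_end = max(end for _, end in intervals)
    return (total_samples - last_end) / sr


# 10 s file at 16 kHz whose last voiced interval ends at 8 s:
print(trailing_silence(160000, [(0, 48000), (64000, 128000)]))  # 2.0
```

If this value roughly matches the missing duration at the end of the output, the final silent slice is a likely cause.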