Open ozcelikkale opened 5 years ago
I used overlap_rate = 0 and embedding_per_second = 0.3. By the way, if I select embedding_per_second = 1.2 and overlap_rate = 0.4, it almost works and covers nearly the whole audio file.
Perhaps the last mute slice was ignored.
I have tested the code with long audio files (20+ minutes) and noticed that the algorithm does not cover the whole audio file. For instance, if the audio is 35 min long, the code returns about 34:23 min of audio. Also, after a certain point, the speaker change points drift by about 2-3 seconds, and later they drift by 10+ seconds. Do you have any suggestions?