pyannote / pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
http://pyannote.github.io
MIT License

Speaker Change Detection #74

Closed by aiparikh 6 years ago

aiparikh commented 6 years ago

Hi,

I have three questions:

  1. Is 'speech activity detection' a prerequisite for 'speaker change detection'?

  2. Is the current approach in pyannote speaker- and content-invariant?

  3. When striking a balance between purity and coverage, what coverage value is good enough in practice?

Regards,
Ankur

hbredin commented 6 years ago
  1. No, both can be done independently from each other.
  2. Yes, as long as your training data contains lots of different speakers.
  3. It depends on what you plan to do with the resulting speech turns. If you plan to apply hierarchical agglomerative clustering, you should prefer high purity (>90%).
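
To make the purity/coverage trade-off in answer 3 concrete, here is a minimal sketch in plain Python. It assumes the usual segment-level definitions (coverage: for each reference speech turn, the duration of its largest overlap with a single hypothesized segment, summed and divided by total reference duration; purity: the same with the roles swapped). The helper names `overlap`, `coverage` and `purity` and the toy segments are illustrative only and are not the pyannote API.

```python
# Illustrative helpers (hypothetical names, not part of pyannote).
# Segments are (start, end) tuples in seconds.

def overlap(a, b):
    """Duration of the intersection of two segments (0 if disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def coverage(reference, hypothesis):
    """Fraction of reference duration covered by the best-matching
    hypothesis segment of each reference segment."""
    total = sum(end - start for start, end in reference)
    if total == 0:
        return 1.0
    covered = sum(
        max((overlap(r, h) for h in hypothesis), default=0.0)
        for r in reference
    )
    return covered / total


def purity(reference, hypothesis):
    """Dual of coverage: swap the roles of reference and hypothesis."""
    return coverage(hypothesis, reference)


# Toy example: two reference speech turns of 10 s each.
reference = [(0.0, 10.0), (10.0, 20.0)]

# Over-segmentation: every hypothesized segment lies inside one turn.
over_segmented = [(0.0, 5.0), (5.0, 10.0), (10.0, 15.0), (15.0, 20.0)]

# Under-segmentation: one hypothesized segment spans both turns.
under_segmented = [(0.0, 20.0)]

print(purity(reference, over_segmented), coverage(reference, over_segmented))
print(purity(reference, under_segmented), coverage(reference, under_segmented))
```

In this toy case, over-segmentation keeps purity at 1.0 while coverage drops to 0.5, and under-segmentation does the opposite. A later clustering step can merge pure segments that belong to the same speaker, but it cannot split a segment that already mixes two speakers, which is one way to read the advice to favor high purity before agglomerative clustering.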