CNN-based audio segmentation toolkit. Detects speech, music, noise, and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
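As a rough illustration of what "speech time per gender" means, the sketch below sums segment durations per label. It assumes the segmentation is available as a list of `(label, start, end)` tuples with times in seconds. The label names, the sample data, and the helper `speech_time_per_label` are hypothetical and are not taken from the toolkit itself.

```python
from collections import defaultdict


def speech_time_per_label(segments):
    """Sum the duration (in seconds) of each label in (label, start, end) tuples."""
    totals = defaultdict(float)
    for label, start, end in segments:
        totals[label] += end - start
    return dict(totals)


# Hypothetical segmentation result; labels and times are made up for illustration.
segments = [
    ("music", 0.0, 4.0),
    ("female", 4.0, 10.0),
    ("male", 10.0, 13.0),
    ("female", 13.0, 15.0),
    ("noise", 15.0, 16.0),
]

totals = speech_time_per_label(segments)

# Share of speech time attributed to female speakers (assumed label names).
speech = totals.get("female", 0.0) + totals.get("male", 0.0)
female_share = totals.get("female", 0.0) / speech if speech else 0.0
```

Only the segments labelled as speech by gender enter the ratio; music and noise are tallied but excluded from the denominator.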
MIT License
736 stars
126 forks
Adding new segmenter using VBx-based model + Xvectors are now computed after voice activity detection + new test added #77