CNN-based audio segmentation toolkit. Detects speech, music, noise, and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
Is there any way to get access to the script used to train the CNN, rather than only using the CNNs pretrained on French speakers? This would help researchers evaluate the model on native speakers of other languages. Thank you.
I did not release the training code for several reasons:
The training code is ugly and would require a substantial amount of work before it could be released as open source.
The model was trained on a private dataset that cannot be released. Consequently, it would be difficult to reproduce the training under the same conditions.
I believe that the choice of a "best" DNN architecture (depth, number of neurons, etc.) depends on the properties of the training dataset (amount of data, available metadata, class distributions). For this reason, I do not believe my architecture would be the best choice for a different training dataset.
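For anyone who wants to train on their own data, here is a minimal sketch of how the dataset properties mentioned above (amount of data, class distribution) could be summarized before picking an architecture. This is not the unreleased training code: the function name `class_distribution` and the `(label, start_sec, end_sec)` tuple layout are assumptions made for illustration only.

```python
from collections import defaultdict


def class_distribution(segments):
    """Summarize a labelled dataset.

    `segments` is an iterable of (label, start_sec, end_sec) tuples
    (a hypothetical annotation layout, not the toolkit's own format).
    Returns the total annotated duration in seconds and the share of
    that duration belonging to each class.
    """
    totals = defaultdict(float)
    for label, start, end in segments:
        totals[label] += end - start

    total = sum(totals.values())
    if total == 0:
        return 0.0, {}

    shares = {label: duration / total for label, duration in totals.items()}
    return total, shares


# Made-up annotations, not taken from any real corpus.
example = [
    ("female", 0.0, 2.0),
    ("male", 2.0, 8.0),
    ("music", 8.0, 10.0),
]
total, shares = class_distribution(example)
```

A strongly skewed distribution or a small total duration would be one reason to prefer a smaller network or class re-weighting over reusing the published architecture unchanged.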