ubclaunchpad / minutes

:telescope: Speaker diarization via transfer learning
https://medium.com/ubc-launch-pad-software-engineering-blog/speaker-diarisation-using-transfer-learning-47ca1a1226f4

Handle Varying Sample Rates #113

Open chadlagore opened 6 years ago

chadlagore commented 6 years ago

When a user passes in a folder of audio files, minutes loads all of them, concatenates them, then reshapes the result into phrases. There is an edge case where some of the audio files in the folder have a different sample rate. We need to either handle that case or raise an error.

https://github.com/ubclaunchpad/minutes/blob/cf7deaae0add9300c2f5af849771884b85314e47/minutes/audio.py#L25-L27
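For the "throw errors" option, here is a minimal sketch of a check that could run before concatenation. It assumes the folder holds `.wav` files readable by the standard-library `wave` module; `SampleRateMismatchError`, `read_sample_rate` and `check_sample_rates` are hypothetical names for illustration, not part of minutes.

```python
import wave
from collections import Counter
from pathlib import Path


class SampleRateMismatchError(ValueError):
    """Hypothetical error: files in a folder do not share one sample rate."""


def read_sample_rate(path):
    """Return the sample rate (Hz) stored in a WAV file's header."""
    with wave.open(str(path), "rb") as wav:
        return wav.getframerate()


def check_sample_rates(folder):
    """Return the single sample rate shared by every .wav file in `folder`.

    Raises SampleRateMismatchError if the files disagree, and ValueError
    if the folder contains no .wav files.
    """
    rates = {p.name: read_sample_rate(p) for p in sorted(Path(folder).glob("*.wav"))}
    if not rates:
        raise ValueError(f"no .wav files found in {folder}")

    counts = Counter(rates.values())
    if len(counts) > 1:
        # Treat the most common rate as the expected one and report the rest.
        expected, _ = counts.most_common(1)[0]
        offenders = {name: rate for name, rate in rates.items() if rate != expected}
        raise SampleRateMismatchError(
            f"expected {expected} Hz, but found other rates: {offenders}"
        )
    return next(iter(counts))
```

Only the headers are read here, so the check is cheap enough to run on every folder before any audio is loaded.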

chadlagore commented 6 years ago

Elsewhere...

If a user reads in audio files with different sample rates, a phrase of fixed duration spans a different number of samples in each file, so our observation lengths begin to differ and we cannot learn. One fix for this is to resample all of the audio to a common rate (say 44100 Hz, or some user-specified number).
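A rough sketch of the resampling fix, assuming each clip is already loaded as a mono sequence of samples paired with its rate. It uses plain linear interpolation, which is a stand-in chosen to keep the example dependency-free: it applies no anti-aliasing filter, so a filtered resampler such as `scipy.signal.resample_poly` or `librosa.resample` would likely be a better fit in practice. `resample_linear`, `to_common_rate` and the default `target_rate` are hypothetical.

```python
def resample_linear(samples, source_rate, target_rate):
    """Resample a mono signal by linear interpolation.

    `samples` is a sequence of numbers recorded at `source_rate` Hz; the
    result is a list of floats at `target_rate` Hz.
    """
    if source_rate <= 0 or target_rate <= 0:
        raise ValueError("sample rates must be positive")
    if source_rate == target_rate or len(samples) < 2:
        return [float(s) for s in samples]

    n_out = max(1, round(len(samples) * target_rate / source_rate))
    step = source_rate / target_rate  # input samples advanced per output sample
    last = len(samples) - 1

    out = []
    for i in range(n_out):
        pos = min(i * step, last)  # fractional index into the input
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def to_common_rate(clips, target_rate=44100):
    """Resample every (samples, rate) pair in `clips` to `target_rate`."""
    return [resample_linear(samples, rate, target_rate) for samples, rate in clips]
```

After this step, clips of equal duration have equal length in samples, so they can be concatenated and reshaped into phrases of a single size.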