Obscurity involved in sampling rate information of datasets used

google / uis-rnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

https://arxiv.org/abs/1810.04719

Apache License 2.0


Obscurity involved in sampling rate information of datasets used #23

Closed: ronva-h closed this issue 5 years ago

ronva-h commented 5 years ago

The paper states that VoxCeleb (sampled at 16 kHz) is one of the public datasets used to train the d-vector model, while the test dataset is 2000 CALLHOME (sampled at 8 kHz). Could you clarify how this sampling rate mismatch is handled?

wq2012 commented 5 years ago

Our acoustic features, log-mel filterbank energies, are agnostic to the sampling rate.
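One way to read this is that the frames are defined in milliseconds and the mel filters cover a fixed band in Hz, so the feature dimension does not depend on the sampling rate. The sketch below illustrates that reading with NumPy only; it is not the authors' feature pipeline. The function `log_mel_energies` and every parameter value (40 filters, 25 ms frames, 10 ms hop, a 0 to 4000 Hz band) are assumptions chosen for illustration.

```python
import numpy as np


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def log_mel_energies(signal, sample_rate, num_filters=40,
                     frame_ms=25.0, hop_ms=10.0, fmin=0.0, fmax=4000.0):
    """Log-mel filterbank energies; all defaults are illustrative assumptions."""
    # Frame and hop sizes are given in time, then converted to samples.
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    num_frames = 1 + (len(signal) - frame_len) // hop_len

    # Slice the signal into overlapping, Hamming-windowed frames.
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(num_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)

    # Power spectrum and the frequency (in Hz) of each FFT bin.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2 / frame_len
    fft_freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)

    # Triangular filters spaced evenly on the mel scale between fmin and fmax.
    edges = mel_to_hz(np.linspace(hz_to_mel(fmin), hz_to_mel(fmax), num_filters + 2))
    lower, center, upper = edges[:-2, None], edges[1:-1, None], edges[2:, None]
    rising = (fft_freqs - lower) / (center - lower)
    falling = (upper - fft_freqs) / (upper - center)
    fbank = np.clip(np.minimum(rising, falling), 0.0, None)

    # Small floor avoids log(0) for silent frames.
    return np.log(power @ fbank.T + 1e-6)


# Hypothetical one-second clips (random noise) at the two sampling rates.
rng = np.random.default_rng(0)
audio_16k = rng.standard_normal(16000)
audio_8k = rng.standard_normal(8000)
feats_16k = log_mel_energies(audio_16k, 16000)
feats_8k = log_mel_energies(audio_8k, 8000)
print(feats_16k.shape, feats_8k.shape)
```

Under these assumptions both calls return arrays of the same shape (frames by filters), so a model that consumes them sees the same input dimension at either rate. The values themselves need not match exactly, and the upper band edge must stay at or below the Nyquist frequency of the lowest-rate data (4 kHz for 8 kHz audio).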

Also, you can always downsample/upsample the signals.
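A minimal sketch of the downsampling case, assuming 16 kHz audio that should be brought down to 8 kHz: low-pass filter the signal below the new Nyquist frequency, then keep every second sample. The helper `downsample_by_2` and the synthetic test tone are hypothetical and written with NumPy only; in practice a library resampler would normally be used instead.

```python
import numpy as np


def downsample_by_2(signal: np.ndarray, num_taps: int = 101) -> np.ndarray:
    """Halve the sampling rate (e.g. 16 kHz -> 8 kHz) with anti-alias filtering."""
    # Windowed-sinc low-pass FIR with cutoff at a quarter of the input rate,
    # which is the Nyquist frequency of the output signal.
    n = np.arange(num_taps) - (num_taps - 1) / 2
    taps = 0.5 * np.sinc(0.5 * n) * np.hamming(num_taps)
    taps /= taps.sum()  # unity gain at DC

    # mode="same" keeps the filter centered, so no delay is introduced.
    filtered = np.convolve(signal, taps, mode="same")
    return filtered[::2]  # keep every second sample


# Hypothetical input: one second of a 440 Hz tone sampled at 16 kHz.
sr_in = 16000
t = np.arange(sr_in) / sr_in
tone_16k = np.sin(2 * np.pi * 440 * t)
tone_8k = downsample_by_2(tone_16k)
print(len(tone_16k), len(tone_8k))
```

The filter step matters: dropping samples without it would fold any content above 4 kHz back into the 8 kHz signal as aliasing.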