This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
The paper states that VoxCeleb (sampled at 16 kHz) is one of the public datasets used to train the d-vector model, while the test dataset, 2000 CALLHOME, is sampled at 8 kHz.
Could you please clarify how this sampling rate mismatch between training and testing data is handled?