aalto-speech / speaker-diarization

Speaker diarization scripts, based on AaltoASR

Questions about BIC distance calculation #12

fhahaha closed this issue 6 years ago

fhahaha commented 6 years ago

Hi,

I am confused about the BIC distance calculation in speaker_clustering.py, lines 95 and 96:

d = 0.5 * N * np.log(det(S)) - 0.5 * N1 * np.log(det(S1)) - 0.5 * N2 * np.log(det(S2))

As far as I know, the BIC calculation is based on the log-likelihood of each sample under its model. What is the relation between the determinants (det(S), det(S1), det(S2)) and the sample probabilities? Here is the equation I know for the BIC calculation: [image]

I think there must be some theoretical background behind this calculation, but I could not work it out. Could anyone help me with this question? Thanks.

antoniomo commented 6 years ago

Hi!

This particular derivation of the BIC is meant to detect a change at a particular point in time (it assumes only one change occurs), so S, S1 and S2 are the sample covariance matrices of the whole segment and of the parts before and after the hypothesized change point, respectively. Lines 95-96 compute the maximum likelihood ratio statistic for a change occurring at that point, and lines 97-99 subtract the appropriate BIC penalty.
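To see where the determinants come from, here is a short sketch of the standard Gaussian argument. It assumes each segment is modeled by a single full-covariance Gaussian fitted by maximum likelihood, so S, S1 and S2 are taken with 1/N normalization (whether the script normalizes exactly this way is not shown in this thread). The symbol p for the feature dimension is introduced here only for illustration. Plugging the fitted mean and covariance back into the log-likelihood makes the quadratic term collapse to a constant, because the summed Mahalanobis distances equal tr(S^{-1} N S) = N p:

```latex
\begin{aligned}
\log L(X) &= -\frac{N}{2}\log\det S - \frac{Np}{2}\log(2\pi)
             - \frac{1}{2}\sum_{i=1}^{N}(x_i-\hat\mu)^{\top} S^{-1}(x_i-\hat\mu) \\
          &= -\frac{N}{2}\log\det S - \frac{Np}{2}\bigl(\log(2\pi)+1\bigr), \\[4pt]
\log L(X_1) + \log L(X_2) - \log L(X)
          &= \frac{N}{2}\log\det S - \frac{N_1}{2}\log\det S_1 - \frac{N_2}{2}\log\det S_2 .
\end{aligned}
```

The constant terms cancel in the last line because N = N1 + N2, which leaves exactly the expression for d quoted in the question. So the per-sample log-likelihoods are still there; they have just been summed in closed form.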

You can find the proper derivation in:

Chen, Scott, and Ponani Gopalakrishnan. "Speaker, environment and channel change detection and clustering via the bayesian information criterion." Proc. DARPA broadcast news transcription and understanding workshop. Vol. 8. 1998.

I'm not putting a link here because I don't know whether the sites hosting the article allow that, but it's easy to find on Google with "speaker bayes information chen" or a similar query string :)
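For anyone who wants to check the relation numerically, below is a minimal sketch, not the repository's code. All function names are hypothetical, the covariances use maximum-likelihood (1/N) normalization, and the penalty is the usual BIC term for a full-covariance Gaussian with an assumed tuning weight `lam`; the exact penalty used in speaker_clustering.py is not shown in this thread. It compares the likelihood ratio obtained by summing per-sample Gaussian log-likelihoods against the determinant form from the question.

```python
import numpy as np


def ml_gaussian_loglik(x):
    """Total log-likelihood of the rows of x under a Gaussian fitted to x by ML."""
    n, dim = x.shape
    centered = x - x.mean(axis=0)
    cov = centered.T @ centered / n  # ML covariance (1/n normalization, an assumption)
    _, logdet = np.linalg.slogdet(cov)
    # Per-sample Mahalanobis distances (x_i - mu)^T cov^{-1} (x_i - mu)
    mahal = np.einsum("ij,jk,ik->i", centered, np.linalg.inv(cov), centered)
    return float(np.sum(-0.5 * (dim * np.log(2 * np.pi) + logdet + mahal)))


def logdet_ml_cov(x):
    """log(det(S)) for the ML covariance S of the rows of x."""
    centered = x - x.mean(axis=0)
    _, logdet = np.linalg.slogdet(centered.T @ centered / x.shape[0])
    return float(logdet)


def ratio_from_samples(x, split):
    """Log-likelihood ratio: two Gaussians (before/after split) vs. one Gaussian."""
    return (ml_gaussian_loglik(x[:split]) + ml_gaussian_loglik(x[split:])
            - ml_gaussian_loglik(x))


def ratio_from_dets(x, split):
    """Same ratio written with determinants, as in the expression from the question."""
    x1, x2 = x[:split], x[split:]
    n, n1, n2 = len(x), len(x1), len(x2)
    return (0.5 * n * logdet_ml_cov(x)
            - 0.5 * n1 * logdet_ml_cov(x1)
            - 0.5 * n2 * logdet_ml_cov(x2))


def delta_bic(x, split, lam=1.0):
    """Likelihood ratio minus a BIC penalty (penalty form and lam are assumptions)."""
    n, dim = x.shape
    extra_params = dim + dim * (dim + 1) / 2  # one extra mean and covariance
    penalty = 0.5 * lam * extra_params * np.log(n)
    return ratio_from_dets(x, split) - penalty


rng = np.random.default_rng(0)
# Synthetic data: one segment with no change, one with a mean shift at frame 200.
same = rng.normal(size=(400, 3))
changed = np.vstack([rng.normal(size=(200, 3)),
                     rng.normal(loc=3.0, size=(200, 3))])

print(ratio_from_samples(changed, 200), ratio_from_dets(changed, 200))
print(delta_bic(same, 200), delta_bic(changed, 200))
```

The two ratios should agree up to floating-point error, and with this synthetic data the penalized value should come out positive only for the segment that actually contains a change.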