IMHO the last term, p * 0.5 * log(N), should be counted only once in the total, not once per cluster. In https://web.cs.dal.ca/~shepherd/courses/csci6403/clustering/xmeans.pdf it appears in the top BIC equation (where j is the model index, not the cluster index), not in the l(Dn) equation (where n is the cluster index). No guarantees that everything else is fine.
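For reference, here is a minimal sketch of the structure I have in mind, following my reading of the paper's l(Dn) and BIC equations. The function and parameter names (log_likelihood, bic_score, cluster_sizes, dimension) are made up for illustration and are not the identifiers used in this repository; sigma_sq is assumed to be a positive variance estimate.

```python
import math


def log_likelihood(cluster_sizes, dimension, sigma_sq):
    """Sum of the per-cluster l(Dn) terms (hypothetical helper).

    Assumes every cluster is non-empty and sigma_sq > 0.
    """
    N = sum(cluster_sizes)   # total number of points
    K = len(cluster_sizes)   # number of clusters
    total = 0.0
    for n in cluster_sizes:
        total += (
            n * math.log(n)
            - n * math.log(N)
            - n * 0.5 * math.log(2.0 * math.pi)
            - n * dimension * 0.5 * math.log(sigma_sq)
            - (n - K) * 0.5
        )
    return total


def bic_score(cluster_sizes, dimension, sigma_sq):
    """BIC of the whole model: the penalty is subtracted once, outside the loop."""
    N = sum(cluster_sizes)
    K = len(cluster_sizes)
    # Free parameters: K - 1 class probabilities, dimension * K centroid
    # coordinates, and one variance estimate.
    p = (K - 1) + dimension * K + 1
    return log_likelihood(cluster_sizes, dimension, sigma_sq) - p * 0.5 * math.log(N)
```

If the penalty were subtracted inside the loop instead, the result would be lower by (K - 1) * p * 0.5 * log(N).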
I also renamed sigma_sqrt to sigma_sq because it is supposed to hold sigma squared, not a square root.
Note that if sigma_multiplier = float('-inf'), the result will always be infinity, won't it?
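A quick way to check this, assuming the per-cluster term subtracts n * sigma_multiplier as in the l(Dn) formula (the variable names and values here are illustrative, not taken from the code):

```python
import math

# Illustrative values: n points in this cluster, N points overall, K clusters.
n, N, K = 5, 8, 2
sigma_multiplier = float('-inf')  # the degenerate case in question

term = (
    n * math.log(n)
    - n * math.log(N)
    - n * 0.5 * math.log(2.0 * math.pi)
    - n * sigma_multiplier   # -n * (-inf) evaluates to +inf for n > 0
    - (n - K) * 0.5
)

# Adding any finite contribution from other clusters keeps the total at +inf.
total = term + 123.0
print(term, total)
```

Under that assumption both values come out as positive infinity, so a single such cluster dominates the whole score.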