I'm referencing the code here. The mutual information calculation is disc_mi_est = disc_ent - disc_cross_ent. If you look at disc_ent, it is the negative mean of disc_log_q_c, where disc_log_q_c = self.model.reg_disc_latent_dist.logli_prior(disc_reg_z).
However, for Categorical distributions, logli_prior does not merely use the prior uniform probabilities, but instead computes something like SUM_i{x_i * log(p_i)}. Shouldn't it just be SUM_i{p_i * log(p_i)}? Put simply, if H(C) is a constant, why is it being passed data? It appears the entropy method would work as expected if it were passed the prior dist_info.
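For concreteness, here is a minimal numpy sketch (not the InfoGAN code itself; K, prior_p, and the one-hot sampling are just illustrative assumptions) comparing the analytic constant -SUM_i{p_i * log(p_i)} against the sample-based -mean(SUM_i{x_i * log(p_i)}) that a logli_prior-style computation averages:

```python
import numpy as np

K = 10  # number of categories in the uniform Categorical prior (illustrative)
prior_p = np.full(K, 1.0 / K)

# Analytic entropy of the prior: H(C) = -SUM_i p_i * log(p_i), a constant (log K).
h_analytic = -np.sum(prior_p * np.log(prior_p))

# Sample-based version: average -SUM_i x_i * log(p_i) over one-hot samples x
# drawn from the prior. Each one-hot sample picks out -log(1/K), so the mean
# equals the same constant in expectation.
x = np.eye(K)[np.random.randint(0, K, size=1000)]  # one-hot samples
h_sampled = -np.mean(np.sum(x * np.log(prior_p), axis=1))

print(h_analytic, h_sampled)  # both ~ log(10)
```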