BioroboticsLab / IBA

Information Bottlenecks for Attribution
MIT License

Calculation of KL divergence #57

Open Wty1122 opened 2 months ago

Wty1122 commented 2 months ago

Your work is very interesting, but I am still confused about how the KL divergence is calculated.

In the paper, does Appendix E compute the prior distribution of 𝑍, i.e., 𝑃(𝑍)? And since 𝑃(𝑍) involves λ(𝑋) and 𝑅, is that why 𝑄(𝑍) is needed as a variational approximation?

However, in the KL divergence calculation, λ and 𝑅 are treated as constants when computing 𝑃(𝑍∣𝑅). Why is that justified? Is it because λ and 𝑅 are deterministic outputs given 𝑋, so they are fixed for a given input? Or do we also need to account for the distribution of λ?
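
To make the question concrete, here is a minimal sketch of the per-sample KL term as I understand it. It assumes the bottleneck has the form Z = λ(X)·R + (1 − λ(X))·ε with ε ~ N(μ_R, σ_R²), and that Q(Z) = N(μ_R, σ_R²); this is my reading of the paper and may not match it exactly. The function names `kl_gaussian` and `bottleneck_kl` are hypothetical and are not part of the IBA library's API. Here `lam` and `r` are plain numbers (or arrays) for one input, which is exactly the "treated as constants" step I am asking about.

```python
import numpy as np


def kl_gaussian(mu_p, var_p, mu_q, var_q):
    """Elementwise KL[N(mu_p, var_p) || N(mu_q, var_q)] for univariate Gaussians."""
    return 0.5 * (np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)


def bottleneck_kl(r, lam, mu_r, sigma_r):
    """KL[P(Z|R) || Q(Z)] for fixed r and lam (hypothetical helper).

    Assumed model: Z = lam * R + (1 - lam) * eps, eps ~ N(mu_r, sigma_r^2),
    so that P(Z|R) is Gaussian once r and lam are held fixed.
    Requires lam < 1, otherwise the variance of P(Z|R) is zero.
    """
    mu_z = lam * r + (1.0 - lam) * mu_r          # mean of P(Z|R)
    var_z = (1.0 - lam) ** 2 * sigma_r ** 2      # variance of P(Z|R)
    return kl_gaussian(mu_z, var_z, mu_r, sigma_r ** 2)


# lam = 0 replaces the feature entirely with noise, so no information passes.
kl_closed = bottleneck_kl(r=3.0, lam=0.0, mu_r=1.0, sigma_r=2.0)

# lam = 0.5 with r one standard deviation above the mean.
kl_half = bottleneck_kl(r=3.0, lam=0.5, mu_r=1.0, sigma_r=2.0)
```

Under these assumptions the KL term is zero when λ = 0 and grows as λ approaches 1, but the closed form only exists because λ and 𝑅 are plugged in as fixed values.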