BioroboticsLab / IBA

Information Bottlenecks for Attribution

Question about KL-div #18

Closed · geonyeong-park closed this issue 4 years ago

geonyeong-park commented 4 years ago

Hello, I got a lot of inspiration from your work; thank you for sharing the code. I am confused by the information-loss code below: https://github.com/BioroboticsLab/IBA/blob/34baed689b6a6f6e528a329d5386281dbba28dee/IBA/pytorch.py#L401-L410

Q1. Why are the mean and variance used in the code different from the ones in the paper (appendix E)?

Q2. Why is Z normalized? As far as I can tell, the paper does not include such a normalization step.

Thank you

geonyeong-park commented 4 years ago

Closing, since the code used for the paper's results already contains a clear explanation:

We normalize r to simplify the computation of the KL-divergence

    #
    # The equation in the paper is:
    # Z = λ * R + (1 - λ) * ε
    # where ε ~ N(μ_r, σ_r**2)
    #  and given R the distribution of Z ~ N(λ * R + (1 - λ) * μ_r, ((1 - λ) * σ_r)**2)
    #
    # In the code μ_r = self.mean and σ_r = self.std.
    #
    # To simplify the computation of the KL-divergence we normalize:
    #   R_norm = (R - μ_r) / σ_r
    #   ε ~ N(0, 1)
    #   Z_norm ~ N(λ * R_norm, (1 - λ)**2)
    #   Z =  σ_r * Z_norm + μ_r
    #
    # We compute KL[ N(λ * R_norm, (1 - λ)**2) || N(0, 1) ].
    #
    # The KL-divergence is invariant under this affine reparametrization of both distributions, see:
    # https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence#Properties
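
For anyone else puzzling over this, here is a minimal sketch in PyTorch of the trick the comment describes: normalize R, sample Z through the normalized variable, and evaluate the closed-form KL[ N(λ * R_norm, (1 - λ)**2) || N(0, 1) ] element-wise. The function names (`kl_bottleneck`, `sample_z`) and the toy shapes and statistics are illustrative, not the repository's actual API.

    import torch

    def kl_bottleneck(r, lamb, mean_r, std_r):
        """Closed-form KL[ N(λ * R_norm, (1 - λ)**2) || N(0, 1) ] per element.

        Uses KL[N(μ, σ²) || N(0, 1)] = 0.5 * (μ² + σ² - log(σ²) - 1).
        """
        r_norm = (r - mean_r) / std_r        # R_norm = (R - μ_r) / σ_r
        mu_z = lamb * r_norm                 # mean of Z_norm given R
        var_z = (1 - lamb) ** 2              # variance of Z_norm given R
        return 0.5 * (mu_z ** 2 + var_z - torch.log(var_z) - 1)

    def sample_z(r, lamb, mean_r, std_r):
        """Sample Z = σ_r * Z_norm + μ_r with Z_norm = λ * R_norm + (1 - λ) * ε, ε ~ N(0, 1)."""
        r_norm = (r - mean_r) / std_r
        eps = torch.randn_like(r)
        z_norm = lamb * r_norm + (1 - lamb) * eps
        return std_r * z_norm + mean_r       # rescale back to the original feature statistics

    # toy usage with made-up shapes and placeholder statistics
    r = torch.randn(2, 8, 4, 4)                        # feature map R
    lamb = torch.sigmoid(torch.randn(2, 8, 4, 4))      # λ in (0, 1)
    mean_r, std_r = torch.zeros(1), torch.ones(1)      # estimated μ_r, σ_r (placeholders)
    z = sample_z(r, lamb, mean_r, std_r)
    capacity = kl_bottleneck(r, lamb, mean_r, std_r)

Because R_norm is standardized, the prior reduces to N(0, 1) and μ_r, σ_r drop out of the KL term entirely, which is exactly what the normalization buys.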