mfederici / Multi-View-Information-Bottleneck

Implementation of Multi-View Information Bottleneck
125 stars · 17 forks

resampling in MIB #8

Open LindgeW opened 2 weeks ago

LindgeW commented 2 weeks ago

Hi, thanks for your nice work.

I have a question about the resampling trick in MIB. It seems that the parameters \mu and \sigma produced by the neural encoder are learned freely, without any constraint. As you know, in both the VAE and VIB the posterior is typically regularized toward a standard normal prior N(0, I). I wonder whether such unconstrained optimization of \mu and \sigma is reasonable.

https://github.com/mfederici/Multi-View-Information-Bottleneck/blob/296fbb9b3827522ae46290eeeb849ef28e9ded73/utils/modules.py#L30
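For context, the resampling step under discussion can be sketched as the usual reparameterization trick, where the encoder outputs \mu and \sigma and the latent is drawn as z = \mu + \sigma * eps. This is a minimal illustration, not the repository's exact code (the function name and log-sigma parameterization are my assumptions):

```python
import torch

def reparameterize(mu, log_sigma):
    # Hypothetical sketch of the resampling trick: draw eps ~ N(0, I)
    # and return z = mu + sigma * eps, so gradients flow through both
    # mu and log_sigma. Nothing here constrains mu or sigma themselves.
    eps = torch.randn_like(mu)
    return mu + torch.exp(log_sigma) * eps
```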

mfederici commented 2 weeks ago

Thank you for expressing interest in our work! We considered fixing sigma and learning only mu during the optimization phase. This works, but the results are generally slightly worse. My intuition is that, with a fixed sigma, the model cannot express varying degrees of uncertainty for different data points.
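The point about per-data-point uncertainty can be made concrete with an encoder head that predicts both \mu and log \sigma from the input, so that sigma varies across data points (whereas a fixed sigma would be a single constant shared by all inputs). This is a hypothetical sketch, not the repository's architecture:

```python
import torch
import torch.nn as nn

class GaussianHead(nn.Module):
    """Hypothetical encoder head: because log_sigma is a function of x,
    the model can assign a different uncertainty to each data point,
    which a globally fixed sigma cannot."""
    def __init__(self, in_dim, z_dim):
        super().__init__()
        self.mu = nn.Linear(in_dim, z_dim)
        self.log_sigma = nn.Linear(in_dim, z_dim)

    def forward(self, x):
        # Both parameters of q(z|x) depend on the input x.
        return self.mu(x), self.log_sigma(x)
```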

LindgeW commented 2 weeks ago

> Thank you for expressing interest in our work! We considered fixing sigma and learning only mu during the optimization phase. This works but the results are generally slightly worse. My intuition is that, with fixed sigma, the model cannot express varying degrees of uncertainty for different data points.

Yeah, that's true. Have you tried constraining the learned \mu and \sigma in MIB toward a standard normal distribution N(0, 1), as in the VAE? The parameters \mu and \sigma (> 0) in MIB seem to be learned freely, without any explicit constraint.
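For reference, the VAE-style constraint suggested here is usually implemented by adding the closed-form KL divergence between the encoder's Gaussian posterior and a standard normal prior to the loss. A minimal sketch, assuming a log-sigma parameterization (this is the standard VAE/VIB term, not code from this repository):

```python
import torch

def kl_to_standard_normal(mu, log_sigma):
    # Closed-form KL( N(mu, sigma^2) || N(0, I) ) per data point,
    # summed over latent dimensions:
    #   0.5 * (sigma^2 + mu^2 - 1 - 2 * log_sigma)
    # Adding this term (with some weight) to the loss pulls the learned
    # mu toward 0 and sigma toward 1, as in the VAE/VIB objective.
    return 0.5 * (torch.exp(2 * log_sigma) + mu ** 2 - 1 - 2 * log_sigma).sum(dim=-1)
```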