mfederici / Multi-View-Information-Bottleneck

Implementation of Multi-View Information Bottleneck
125 stars 17 forks

Shared parameters between encoder_1 and encoder_2 #3

Closed feng-bao-ucsf closed 3 years ago

feng-bao-ucsf commented 3 years ago

Dear MIB authors

Thank you for releasing this great and clear implementation. I have a question regarding the definition of encoder_1 and encoder_2. From the code, these two encoders share the same weights, but in the paper they are parameterized by theta and psi, respectively. Maybe I got it wrong. Could you help me with this?

Best, Feng

mfederici commented 3 years ago

Dear Feng, you are correct: in the example we report on GitHub, the two encoders (encoder_1 and encoder_2) fully share their parameters, as explained in the last paragraph of section 3.3. This is possible because the two views have the same marginal distribution p(v_1)=p(v_2).

If your two views of interest differ from each other, you can simply initialize encoder_2 (with parameters psi) as a separate architecture and add its parameters to the optimizer. In general, parameter sharing speeds up training and yields better gradient estimates, so if your two views have something in common you can also use partial parameter sharing.
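A minimal PyTorch sketch of the separate-encoder setup described above (the architectures, sizes, and learning rate are illustrative placeholders, not the repo's actual classes):

```python
import torch
import torch.nn as nn

# Two distinct encoders for two different views: encoder_1 carries the
# parameters theta, encoder_2 the parameters psi. The input dimensions
# are hypothetical and would match each view's feature size.
encoder_1 = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 2 * 32))
encoder_2 = nn.Sequential(nn.Linear(1024, 64), nn.ReLU(), nn.Linear(64, 2 * 32))

# When the encoders no longer share weights, BOTH parameter sets must be
# handed to the optimizer, otherwise encoder_2 would never be updated.
opt = torch.optim.Adam(
    list(encoder_1.parameters()) + list(encoder_2.parameters()),
    lr=1e-4,
)
```

Partial sharing would instead reuse some layers (e.g. a common trunk) in both encoders, so their parameters appear only once in the optimizer.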

I hope this helps to clarify your doubts. Best,

Marco

feng-bao-ucsf commented 3 years ago

Hi Marco, thanks for the clarification.