I understand that in mi_first_forward, the network being optimized is CLUBSample*, which is trained to predict mu and logvar. In mi_second_forward, the core network (the VC part) is optimized. loglikeli is computed in mi_first_forward, while mi_est is computed in mi_second_forward. I am confused about two things:
Are the values of loglikeli and mi_est both used for representation disentanglement?
Why is loglikeli used instead of mi_est in mi_first_forward?
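
To make the question concrete, here is a minimal pure-Python sketch of the two quantities. It is not the repository's code. It assumes that CLUBSample* models q(y|x) as a Gaussian with predicted mu and logvar, that loglikeli is the batch-averaged (unnormalized) log-likelihood of the matched pairs, and that mi_est is a sampled CLUB-style bound comparing matched pairs with shuffled pairs. The scalar inputs and the fixed permutation `perm` are hypothetical, chosen only for illustration.

```python
import math

def loglikeli(mu, logvar, y):
    """Unnormalized Gaussian log-likelihood of y under q(y|x), batch-averaged.

    Assumption: this is the quantity maximized when fitting the
    variational network (the mi_first_forward step).
    """
    terms = [-(m - yi) ** 2 / math.exp(lv) - lv
             for m, lv, yi in zip(mu, logvar, y)]
    return sum(terms) / len(terms)

def mi_est(mu, logvar, y, perm):
    """Sampled CLUB-style estimate: matched pairs minus shuffled pairs.

    Assumption: this is the quantity minimized by the main network
    (the mi_second_forward step). `perm` stands in for a random shuffle.
    """
    pos = [-(m - yi) ** 2 / math.exp(lv)
           for m, lv, yi in zip(mu, logvar, y)]
    neg = [-(m - y[j]) ** 2 / math.exp(lv)
           for m, lv, j in zip(mu, logvar, perm)]
    return sum(p - n for p, n in zip(pos, neg)) / (2 * len(pos))

# Hypothetical toy batch: the predicted means match y exactly.
mu = [0.0, 1.0, 2.0]
logvar = [0.0, 0.0, 0.0]
y = [0.0, 1.0, 2.0]
perm = [1, 2, 0]  # fixed shuffle so the example is deterministic

print(loglikeli(mu, logvar, y))
print(mi_est(mu, logvar, y, perm))
```

In this sketch both functions share the same mu and logvar, but loglikeli only scores the matched pairs, while mi_est contrasts them with shuffled pairs.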