astra-vision / CoMoGAN

CoMoGAN: continuous model-guided image-to-image translation. CVPR 2021 oral.
Apache License 2.0

equation (7) in the paper #3

Closed NguyenTriTrinh closed 3 years ago

NguyenTriTrinh commented 3 years ago

Hi, I think your work is really interesting! I have a question about equation (7) in the paper: h^Y and h^Y_M are each written as a sum of three kinds of features, but in the code they are sums of four kinds of features. Did I misunderstand something?

https://github.com/cv-rits/CoMoGAN/blob/dd3824715152f6464a95c99dd6f936744992b122/networks/backbones/comomunit.py#L145

fabvio commented 3 years ago

No, you're right. Actually, thanks a lot for the issue; I think this could be specified better, so I'll include a note in the README.

We experimented with several architectures for the DRB and noticed that adding a residual connection after the FIN layers improved training stability. Hence, we can formalize the h^\phi feature as the composition of the outputs of two residual branches, one of which contains FIN layers. This allows both continuous encoding of the features and better training of the network. We omitted this detail for the sake of simplicity. To be clearer, we could formalize the output features as

# h^Y_M = h^E_M + h^\phi + h^x
physical_output_features = physical_features + (continuous_features + common_features) + input_features
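As a toy sketch of that composition (hypothetical module and branch names, not the actual CoMoGAN code at the linked line), the four-term sum can be written as a small PyTorch block where two residual branches stand in for h^\phi, one of them playing the role of the FIN-conditioned branch:

```python
import torch
import torch.nn as nn

class ToyDRB(nn.Module):
    """Illustrative stand-in for the DRB output sum:
    h^Y_M = h^E_M + h^phi + h^x, with h^phi itself the sum of two
    residual branches (one containing FIN layers in the real network)."""

    def __init__(self, channels: int = 8):
        super().__init__()
        # Placeholder convolutions; the real branches are residual blocks.
        self.physical_branch = nn.Conv2d(channels, channels, 3, padding=1)    # h^E_M
        self.continuous_branch = nn.Conv2d(channels, channels, 3, padding=1)  # FIN branch of h^phi
        self.common_branch = nn.Conv2d(channels, channels, 3, padding=1)      # plain branch of h^phi

    def forward(self, input_features: torch.Tensor) -> torch.Tensor:
        physical_features = self.physical_branch(input_features)
        continuous_features = self.continuous_branch(input_features)
        common_features = self.common_branch(input_features)
        # Four features summed: three branch outputs plus the input skip.
        return physical_features + (continuous_features + common_features) + input_features

drb = ToyDRB()
out = drb(torch.randn(1, 8, 16, 16))
print(out.shape)  # torch.Size([1, 8, 16, 16])
```

The extra skip term is what makes the code show four summands where equation (7) shows three.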

If it's clear, I'll close the issue; otherwise I'll keep it open for discussion.

NguyenTriTrinh commented 3 years ago

I get it, thx~