omni-us / research-GANwriting

Source code for ECCV20 "GANwriting: Content-Conditioned Generation of Styled Handwritten Word Images"
MIT License

Errors in Architecture Overview #7

Open hendraet opened 4 years ago

hendraet commented 4 years ago

[image: architecture overview figure]

When looking at the image of the architecture overview, I noticed two things that were reflected differently in the code.

  1. The noise that is added to Fs is not present in the code. Am I missing something here?
  2. The cubes that represent the shape of F are misleading because they imply that the channel dimension changes when merging hat{Fs} and Fc. However, the linear layer here halves the number of channels of the combined feature maps. This way the number of channels is 512.

If I am not mistaken and have not missed something, would it be possible to fix those issues? Apart from those minor flaws, the graphic is really beautiful and provides a great overview of the network's architecture.

leitro commented 4 years ago

Thanks for pointing out the useful details!

  1. Yes, you are right: in the code we didn't introduce the noise explicitly. Since Xi is a subset of images, shuffling Xi is a way to introduce noise implicitly, which was our original intuition. I agree with you that the noise injection arrow in the Figure might mislead people.

  2. Yes, F is the concatenation of hat{Fs} and Fc along the channel dimension, which gives 1024 channels. What is missing in the Figure is the Linear layer between F and G that projects the 1024 channels back to 512.
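
For readers following along, the channel bookkeeping in point 2 can be sketched as below. This is only an illustration in NumPy, not the repository's code: the batch size, the spatial size (4 x 6), the random weights and the variable names `f_s_hat`, `f_c`, `weight`, `bias` are all assumptions made for the example. Only the channel counts (512, 1024, 512) come from the discussion above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Channel count from the thread; batch and spatial sizes are illustrative.
B, C, H, W = 2, 512, 4, 6

f_s_hat = rng.standard_normal((B, C, H, W))  # stand-in for hat{Fs} (style)
f_c = rng.standard_normal((B, C, H, W))      # stand-in for Fc (content)

# F: concatenation along the channel axis, 512 + 512 = 1024 channels.
f = np.concatenate([f_s_hat, f_c], axis=1)

# Hypothetical linear layer mapping 1024 -> 512 channels, stored as
# (out_features, in_features) with random weights for the sketch.
weight = rng.standard_normal((C, 2 * C)) * 0.01
bias = np.zeros(C)

# A linear layer acts on the last axis, so move channels last,
# project, then move channels back to axis 1.
f_last = np.transpose(f, (0, 2, 3, 1))        # (B, H, W, 1024)
projected = f_last @ weight.T + bias          # (B, H, W, 512)
g_input = np.transpose(projected, (0, 3, 1, 2))

print(f.shape)        # shape of F after concatenation
print(g_input.shape)  # shape of what would be passed on to G
```

So F itself has twice the channels of either input, and it is the projection afterwards that brings the tensor back to 512 channels before G.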

We will try to update the Figure in the next version, cheers:-)
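
The "implicit noise" idea from point 1 could look roughly like the following. This is a guess at the general idea only, not the repository's data loader: the helper `shuffled_style_set` and the file names are hypothetical, and how the actual code samples and orders Xi may differ.

```python
import random


def shuffled_style_set(x_i, rng):
    """Return the style images of one writer in a new random order.

    The input list is left untouched, so every call reshuffles the
    same subset Xi.
    """
    shuffled = list(x_i)
    rng.shuffle(shuffled)
    return shuffled


rng = random.Random(0)

# Hypothetical file names standing in for the image subset Xi.
x_i = [f"writer_a_word_{n}.png" for n in range(5)]

# Each training step would see the same images in a different order.
for step in range(3):
    print(step, shuffled_style_set(x_i, rng))
```

The content of Xi never changes here, only its order, which is why this acts as a perturbation of the style input rather than an explicit noise vector.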