Open hendraet opened 4 years ago
Thanks for pointing out the useful details!
Yes, you are right: in the code we didn't introduce the noise explicitly. Since Xi is a subset of images, shuffling Xi introduces noise implicitly, which was our original intuition. I agree with you that the noise injection arrow in the Figure might mislead people.
Yes, F is the concatenation of \hat{Fs} and Fc along the channel dimension, which gives 1024 channels. What the Figure is missing is the Linear layer between F and G that projects the 1024 channels back to 512.
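For readers following along, the concatenation and projection described above can be sketched roughly as follows. This is only an illustration in NumPy, not the repository's code: the batch size, the 512 + 512 channel split, the flat (batch, channels) layout, and the names `f_s_hat`, `f_c`, `weight`, and `bias` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

batch = 4        # hypothetical batch size
channels = 512   # assumed channel count of each input feature

# Stand-ins for \hat{Fs} and Fc (random values, shapes are assumptions)
f_s_hat = rng.standard_normal((batch, channels))
f_c = rng.standard_normal((batch, channels))

# F: concatenate along the channel axis, 512 + 512 = 1024 channels
f = np.concatenate([f_s_hat, f_c], axis=1)

# Linear layer between F and G: project 1024 channels back to 512
weight = rng.standard_normal((2 * channels, channels)) * 0.01
bias = np.zeros(channels)
g_input = f @ weight + bias

print(f.shape)        # (4, 1024)
print(g_input.shape)  # (4, 512)
```

The point of the sketch is only the shape bookkeeping: G receives 512 channels, not the 1024 that come out of the concatenation.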
We will try to update the Figure in the next version, cheers:-)
When looking at the image of the architecture overview, I noticed two things that were reflected differently in the code.
If I am not mistaken and have not missed anything, would it be possible to fix those issues? Apart from those minor flaws, the graphic is really beautiful and provides a great overview of the network's architecture.