Closed by lihengtao 1 year ago
It's a good question! The forward process (Eq. 2 in the paper, i.e., adding noise to the input image) is stochastic, so repeating the image, adding independently sampled noise to each copy, and then average-pooling the extracted features can further stabilize and slightly boost performance. This operation is also mentioned in the last paragraph of Sec. 4.2. You can set `ensemble_size` to 1, though, if you find it doesn't make much difference in your downstream task.
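To make the idea concrete, here is a minimal NumPy sketch of noise ensembling. It is not the repository's code: `add_noise`, `toy_features`, and `ensemble_features` are hypothetical names, `toy_features` is a placeholder for the real feature extractor, and I'm assuming Eq. 2 has the standard closed form x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε.

```python
import numpy as np


def add_noise(x0, alpha_bar_t, rng):
    """Assumed forward process: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps."""
    eps = rng.standard_normal(x0.shape)  # fresh Gaussian noise per element
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps


def toy_features(x_t):
    """Hypothetical stand-in for the feature extractor (not the real network)."""
    return np.tanh(x_t)


def ensemble_features(x0, alpha_bar_t, ensemble_size, rng):
    """Average features over `ensemble_size` independently noised copies of x0."""
    # Repeat the image along a new leading batch axis, one copy per noise sample.
    batch = np.repeat(x0[None], ensemble_size, axis=0)
    # Noise is drawn for the whole batch, so every copy gets a different sample.
    noisy = add_noise(batch, alpha_bar_t, rng)
    feats = toy_features(noisy)
    # Average-pool over the ensemble axis to get one feature map per image.
    return feats.mean(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.standard_normal((3, 4, 4))  # toy "image", channels first
    feats = ensemble_features(image, alpha_bar_t=0.5, ensemble_size=8, rng=rng)
    print(feats.shape)
```

With `ensemble_size=1` the function reduces to a single noisy forward pass, which is cheaper but gives features that vary more from run to run.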
I've got that. Thanks for your reply!
Hi! Thanks for your great work. I don't understand why the input image is repeated 8 times. Can `ensemble_size` be set to 1?