felix-swadel closed 4 years ago
@felixems Transposed convolution (a.k.a. deconvolution) tends to generate checkerboard artifacts. To alleviate this problem, one can use resize-convolution instead: upsample the feature map with nearest-neighbor or bilinear interpolation, then apply a regular stride-1 convolution. A detailed description can be found below. Note that resize-convolution has been widely used for generation tasks (e.g. SPADE, BigGAN, StarGAN v2, and StyleGAN2).
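As a rough illustration of the swap, here is a minimal PyTorch sketch. It assumes the decoder upsamples with `nn.ConvTranspose2d(kernel_size=4, stride=2, padding=1)`; the channel counts and the `ResizeConv2d` class name are hypothetical, so adjust them to match your own generator.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResizeConv2d(nn.Module):
    """Hypothetical helper: interpolate first, then apply a stride-1 convolution."""

    def __init__(self, in_channels, out_channels, kernel_size=3, scale_factor=2):
        super().__init__()
        self.scale_factor = scale_factor
        # "Same" padding for odd kernel sizes, so only the interpolation changes the size.
        self.conv = nn.Conv2d(
            in_channels, out_channels, kernel_size, stride=1, padding=kernel_size // 2
        )

    def forward(self, x):
        # Nearest-neighbor upsampling gives every output pixel the same kernel
        # coverage, which avoids the uneven overlap of a strided transposed conv.
        x = F.interpolate(x, scale_factor=self.scale_factor, mode="nearest")
        return self.conv(x)


# Assumed original upsampling layer (channel counts are illustrative).
deconv = nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1)
# Drop-in replacement that produces the same output shape.
resize_conv = ResizeConv2d(64, 32)

x = torch.randn(1, 64, 16, 16)
out_deconv = deconv(x)
out_resize = resize_conv(x)
print(out_deconv.shape, out_resize.shape)
```

Both layers double the spatial resolution, so the replacement should not require changes elsewhere in the network. The new layer has different weights, though, so the generator needs to be retrained rather than loaded from an existing checkpoint.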
I'm attempting to train StarGAN on a series of facial images with the goal of generating new versions of each face with different skin tones. The actual tone transfer is going perfectly, but the output images are not yet of high enough quality. They suffer from some blur, but more problematic is the pervasive checkerboarding I'm seeing, as well as other low-frequency artifacts (see image).
Do you have any tips to alleviate this in training?
Image: https://user-images.githubusercontent.com/54915877/72378639-a610aa00-3776-11ea-8c49-c01d4beb9d21.png