AlacGAN [User-Guided Deep Anime Line Art Colorization with Conditional Adversarial Networks]
paper url : https://arxiv.org/pdf/1808.03240.pdf
Contributions
Conditioned on the output of the local features network
Unlike existing approaches that use typical conditional networks (e.g., pix2pix), we combine a cGAN with a pretrained local features network, i.e., both the generator and the discriminator are conditioned only on the output of the local features network, to increase generalization over authentic line arts.
Note: the effect of the local features network does not come across convincingly.
Uses ResNeXt, dilated convolutions, and WGAN-GP
We fuse the conditional framework with WGAN-GP and perceptual loss as the criterion in the GAN training stage. This allows us to robustly train a network with more capacity, making the synthesized images more natural and realistic.
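To make the WGAN-GP term concrete, here is a minimal numpy sketch of the gradient penalty. It uses a toy linear critic f(x) = x·w so the input gradient is available in closed form; in a real setup the gradient of the actual discriminator is obtained via autograd. All names here are illustrative, not from the paper.

```python
import numpy as np

def gradient_penalty_linear(w, x_real, x_fake, rng):
    """WGAN-GP penalty E[(||grad_x f(x_hat)||_2 - 1)^2] for a linear critic
    f(x) = x @ w, evaluated at random interpolates of real/fake samples."""
    # Interpolate between real and fake samples: x_hat = eps*x_real + (1-eps)*x_fake
    eps = rng.uniform(size=(x_real.shape[0], 1))
    x_hat = eps * x_real + (1.0 - eps) * x_fake
    # For a linear critic the gradient w.r.t. the input is w everywhere,
    # so the penalty reduces to (||w||_2 - 1)^2 at every interpolate.
    grad = np.broadcast_to(w, x_hat.shape)
    grad_norm = np.linalg.norm(grad, axis=1)
    return np.mean((grad_norm - 1.0) ** 2)
```

The penalty pushes the critic toward unit gradient norm, which is what lets WGAN-GP train higher-capacity networks stably.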
dataset제공
Moreover, we collected two cleaned datasets with high quality color illustrations and hand-drawn line arts. They provide a stable training data source as well as a test benchmark for line art colorization. by training with the proposed illustration dataset and adding minimal augmentation, our model can handle general anime line arts with stroke color hints
Content loss (perceptual loss)
To penalize color/structural mismatch between the generator's output and the ground truth, we adopt perceptual loss as our content loss.
Note: what if ill2vec were used here instead?
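The content loss above can be sketched as an L2 distance between feature maps of a fixed, pretrained feature extractor (typically a VGG layer). The sketch below abstracts the extractor as a plain function so it stays runnable; `feat_fn` is a placeholder, not the paper's actual network.

```python
import numpy as np

def perceptual_loss(feat_fn, y_hat, y):
    """Content loss: mean squared distance between fixed-network feature
    maps of the generator output (y_hat) and the ground truth (y)."""
    f_hat, f = feat_fn(y_hat), feat_fn(y)
    return np.mean((f_hat - f) ** 2)
```

Because the features come from a frozen network, the loss penalizes semantic/structural mismatch rather than exact per-pixel color values.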
Hint
We trade off between them and use randomly sampled points at a 4x-downsampled scale to simulate stroke-based inputs, with the intuition that color strokes tend to have uniform color values and dense spatial information.
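The hint simulation can be sketched as follows: sample random pixels of the 4x-downsampled ground-truth color image as sparse point hints, plus a binary mask marking where hints exist. Function name and the mask-channel convention are assumptions for illustration.

```python
import numpy as np

def simulate_stroke_hints(color_img, n_points, rng):
    """Simulate user color strokes during training: random point samples of
    the 4x-downsampled ground truth, returned with a binary hint mask."""
    h, w, c = color_img.shape
    # 4x downsample by striding (nearest neighbour); strokes are assumed
    # locally uniform in color, so the coarse grid loses little information.
    small = color_img[::4, ::4]
    sh, sw = small.shape[:2]
    hints = np.zeros((sh, sw, c), dtype=color_img.dtype)
    mask = np.zeros((sh, sw, 1), dtype=np.float32)
    ys = rng.integers(0, sh, size=n_points)
    xs = rng.integers(0, sw, size=n_points)
    hints[ys, xs] = small[ys, xs]
    mask[ys, xs] = 1.0
    return hints, mask
```

At test time the same coarse grid receives the user's actual stroke colors, so train/test inputs match in resolution and sparsity.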
Augmentation
To take non-black sketches into account, every sketch image is randomly scaled to x̂ = 1 - r(1 - x), where r is sampled from a uniform distribution U(0.7, 1). We resize the image pairs so that the shorter side is 512, then randomly crop to 512x512 before random horizontal flipping.
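The sketch-lightening formula is a one-liner; a minimal sketch, assuming pixel values in [0, 1] with 1 as the white background (function name is illustrative):

```python
import numpy as np

def soften_sketch(x, rng):
    """Randomly lighten a sketch via x_hat = 1 - r*(1 - x), r ~ U(0.7, 1).
    White (x=1) is unchanged; pure-black strokes (x=0) become grey with
    value 1-r in [0, 0.3], simulating non-black line art."""
    r = rng.uniform(0.7, 1.0)
    return 1.0 - r * (1.0 - x)
```

This keeps the background fixed while compressing stroke darkness, so the model does not overfit to perfectly black synthetic line extractions.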
Relation to other techniques