Analyzing and Improving the Image Quality of StyleGAN

StyleGAN2. CVPR 2020 (arXiv preprint, 2019)

Abstract

The paper addresses the shortcomings of StyleGAN by revising both the model architecture and the training method.

Redesigns the normalization used in the generator
Revisits progressive growing

Introduction

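Images generated by StyleGAN exhibit characteristic artifacts that make them look artificial.

Problems

Droplet artifact
- Even when it is barely visible in the final image, it is always present in the feature maps.
- It starts to appear around the 64x64 feature maps and becomes progressively stronger at higher resolutions.
- It is present in every StyleGAN image; in the cases where the artifact does not form, the generated image is severely corrupted.
- It is caused by AdaIN (adaptive instance normalization). The original StyleGAN normalizes each feature map of the conv output separately by its mean and variance, which destroys the information carried by the relative magnitudes of the features.
Phase artifacts
- Progressive growing gives details such as teeth and eyes a strong location preference.
- Even when the angle of the face changes slightly, the arrangement of the teeth stays fixed.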
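
The AdaIN point under "Droplet artifact" can be illustrated with a minimal NumPy sketch. It only shows the normalization half of AdaIN on toy data; the function name `instance_norm` and the two toy feature maps are assumptions made for this illustration, not code from the paper.

```python
import numpy as np

def instance_norm(x, eps=1e-8):
    """Normalize each feature map (channel) to zero mean and unit variance.

    x has shape (channels, height, width); statistics are taken per channel.
    """
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    return (x - mean) / (std + eps)

rng = np.random.default_rng(0)
# Two toy feature maps: the second is 100x stronger than the first.
weak = rng.normal(size=(1, 8, 8))
strong = 100.0 * rng.normal(size=(1, 8, 8))
x = np.concatenate([weak, strong], axis=0)

before = x.std(axis=(1, 2))                 # very different magnitudes
after = instance_norm(x).std(axis=(1, 2))   # both approximately 1
```

After normalization both maps have roughly unit standard deviation, so the fact that one feature was much stronger than the other is no longer visible to later layers.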

Method

Generator architecture revisited

Split AdaIN into separate normalization and modulation steps.

Remove (simplify) the processing applied to the constant input at the beginning.
The mean is not needed when normalizing the features.
Move the noise module outside the style module.

In the original StyleGAN, AdaIN normalized the mean and variance of the feature maps, whereas StyleGAN2 normalizes the convolution weights instead.
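
A minimal sketch of normalizing the convolution weights rather than the feature maps, following the modulate-then-demodulate idea from the paper. The function name `modulate_demodulate`, the tensor layout `(out_ch, in_ch, kh, kw)`, and the toy values are assumptions for illustration only.

```python
import numpy as np

def modulate_demodulate(weight, style, eps=1e-8):
    """weight: (out_ch, in_ch, kh, kw); style: (in_ch,) per-input-channel scales."""
    # Modulation: scale the weights of each input feature map by its style.
    w = weight * style[None, :, None, None]
    # Demodulation: rescale each output feature map's weights to unit L2 norm,
    # which stands in for explicitly normalizing the output activations.
    norm = np.sqrt((w ** 2).sum(axis=(1, 2, 3), keepdims=True) + eps)
    return w / norm

rng = np.random.default_rng(0)
weight = rng.normal(size=(4, 3, 3, 3))
style = np.array([0.5, 2.0, 10.0])

w_demod = modulate_demodulate(weight, style)
out_norms = np.sqrt((w_demod ** 2).sum(axis=(1, 2, 3)))  # approximately 1 per output channel
```

Because the scaling is folded into the weights, no statistics are computed from the actual feature maps, which is the property that removes the AdaIN-style normalization.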

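The mean was removed from AdaIN and only the standard deviation is used; the authors found that the standard deviation alone is sufficient.

The bias and noise were moved outside the block, which makes the influence of the style and of the noise independent of each other.

Previously, the influence of the noise was inversely proportional to the magnitude of the style; after the change, the effect of varying the noise became clear.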
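
The inverse relationship between noise influence and style magnitude can be seen in a toy calculation. This is a simplified sketch that assumes the noise is added to an activation already scaled by the style; `relative_noise_impact` and the chosen scales are hypothetical names and values, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 16))             # toy feature map
noise = 0.1 * rng.normal(size=(16, 16))   # per-pixel noise of fixed strength

def relative_noise_impact(style_scale):
    """Ratio of noise magnitude to styled-signal magnitude
    when the noise is added inside the style block."""
    signal = style_scale * x
    return noise.std() / signal.std()

impact_small_style = relative_noise_impact(1.0)
impact_large_style = relative_noise_impact(10.0)
```

With a 10x larger style scale the same noise has a 10x smaller relative effect, which is why applying the noise outside the style block, on normalized data, makes its effect more predictable.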

These changes resolve the droplet artifact problem.

Phase artifacts

The authors propose an alternative design that retains the benefits of progressive growing without its drawbacks.
- Alternative design: training starts by focusing on low-resolution images and then progressively shifts focus to higher and higher resolutions, without changing the network topology during training.

StyleGAN was trained with the progressive growing idea in order to generate high-resolution images.

To make training stable while allowing deeper networks, StyleGAN2 explored alternative network designs. It adopts skip-connection designs similar to those in ResNet.
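
One way to picture a skip-connection generator is that every resolution produces its own RGB image, and these are upsampled and summed. The sketch below is a toy NumPy version under that assumption; the helper names, the nearest-neighbor upsampling (a simplification of the filtering used in practice), and the random features are all illustrative.

```python
import numpy as np

def upsample2x(img):
    """Nearest-neighbor 2x upsampling of a (channels, h, w) array."""
    return img.repeat(2, axis=1).repeat(2, axis=2)

def to_rgb(features, w_rgb):
    """1x1 convolution mapping (c, h, w) features to a (3, h, w) RGB image."""
    return np.einsum("oc,chw->ohw", w_rgb, features)

def skip_generator_output(features_per_res, rgb_weights):
    """Sum the upsampled RGB contributions from every resolution."""
    rgb = None
    for feats, w_rgb in zip(features_per_res, rgb_weights):
        contrib = to_rgb(feats, w_rgb)
        rgb = contrib if rgb is None else upsample2x(rgb) + contrib
    return rgb

rng = np.random.default_rng(0)
channels = 8
resolutions = [4, 8, 16]
features_per_res = [rng.normal(size=(channels, r, r)) for r in resolutions]
rgb_weights = [rng.normal(size=(3, channels)) for _ in resolutions]

image = skip_generator_output(features_per_res, rgb_weights)  # shape (3, 16, 16)
```

The network topology is fixed from the start of training, yet the low-resolution outputs can dominate early on, which is how this kind of design can keep the coarse-to-fine behavior without growing the network.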

Result

New network

Based on its experiments, the paper selects the type (b) model (output skips) for the generator and the type (c) model (residual) for the discriminator.

The type (b) generator improved PPL, and the type (c) discriminator improved FID.

Compare StyleGAN vs New network

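The original StyleGAN network first concentrated on generating low-resolution images and then shifted to generating high-resolution images.

(b) In the large network, the contribution of the high-resolution layers likewise grows steadily as training progresses, following a trend similar to (a) StyleGAN.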
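
The "contribution" of each resolution can be quantified in a simple way. The sketch below assumes it is measured as the standard deviation of each resolution's RGB output, normalized so that the shares sum to one; the function name and the hand-picked toy outputs are hypothetical.

```python
import numpy as np

def resolution_contributions(rgb_outputs):
    """Share of each resolution's RGB output, measured by pixel standard deviation.

    rgb_outputs: list of (3, h, w) arrays, one per resolution.
    Returns fractions that sum to 1.
    """
    stds = np.array([out.std() for out in rgb_outputs])
    return stds / stds.sum()

rng = np.random.default_rng(0)
# Hypothetical per-resolution outputs with hand-picked strengths,
# mimicking an early training stage dominated by low resolutions.
rgb_outputs = [
    3.0 * rng.normal(size=(3, 4, 4)),
    1.0 * rng.normal(size=(3, 8, 8)),
    0.5 * rng.normal(size=(3, 16, 16)),
]
shares = resolution_contributions(rgb_outputs)
```

Tracking these shares over training time would show whether the focus moves from the low-resolution outputs toward the high-resolution ones.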

With this network design, the authors obtained a generator that produces natural-looking images such as the following.

Result Image

StyleGAN

StyleGAN2

Conclusion

StyleGAN2 improved image quality by improving the normalization and by adding constraints that encourage a smooth latent space.
Even with 8 V100 GPUs, training took 9 days on the FFHQ dataset and 13 days on the LSUN Car dataset.
BigGAN and other models have demonstrated the benefit of larger models.
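
The smoothness constraint mentioned above presumably refers to the paper's path length regularization. As a reminder of its form, it can be written as follows, where the label $\mathcal{L}_{\text{path}}$ is a name chosen here for convenience, $g$ is the generator, $\mathbf{y}$ is a random image with normally distributed pixel intensities, and $a$ is a running average of the observed path lengths.

```latex
\[
\mathcal{L}_{\text{path}}
  = \mathbb{E}_{\mathbf{w},\, \mathbf{y} \sim \mathcal{N}(0, \mathbf{I})}
    \left( \left\lVert \mathbf{J}_{\mathbf{w}}^{T} \mathbf{y} \right\rVert_2 - a \right)^2,
\qquad
\mathbf{J}_{\mathbf{w}} = \frac{\partial g(\mathbf{w})}{\partial \mathbf{w}}
\]
```

The penalty is small when a fixed-size step in the latent space produces a change of roughly constant magnitude in the image, regardless of the latent point or the direction.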

Link

Paper
Code

