Fully Convolutional Networks for Semantic Segmentation

Abstract

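Key insight: “fully convolutional”. This can be considered the first FCN paper.
Segmentation task
- Classification networks such as AlexNet, VGG, and ResNet can be adapted this way to solve the segmentation task.
- In fact, this one figure explains everything.
- So a baseline implementation should not be very hard.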

Fully convolutional networks

Each convnet layer produces a feature map, i.e. a c x w x h tensor.
- spatial information = w x h
- features = c
Convnets are built on translation invariance
- Their basic components (convolution, pooling, and activation functions) operate on local input regions, and depend only on relative spatial coordinates
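The "depends only on relative spatial coordinates" point can be checked numerically. The sketch below is a rough NumPy illustration, not code from the paper: `conv2d_valid` is a hypothetical helper and the image and kernel are random. It shows that shifting the input shifts the convolution response by the same offset, away from the border.

```python
import numpy as np

def conv2d_valid(x, k):
    """Plain 2-D cross-correlation with 'valid' padding (no kernel flip)."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))    # hypothetical single-channel input
kernel = rng.standard_normal((3, 3))   # hypothetical filter

# Shift the image content 2 rows down and 1 column right (with wrap-around).
shifted = np.roll(image, shift=(2, 1), axis=(0, 1))

out = conv2d_valid(image, kernel)
out_shifted = conv2d_valid(shifted, kernel)

# Away from the wrapped-around border, the response moves by the same offset.
same_response = np.allclose(out_shifted[2:, 1:], out[:-2, :-1])
```

The same filter weights are applied at every location, which is what lets a network built only from such layers accept inputs of any size.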
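Convolutionalization: in existing image classification networks such as AlexNet and VGG, the fully connected layers at the end are removed and replaced with convolution layers (1x1; the first one needs a kernel that covers its whole input feature map, e.g. 7x7 in VGG-16).
- Location information about objects is preserved. The following figure found online illustrates this better.
- In practice, the output is then re-scaled to the original image size.
  - Bilinear interpolation is possible, but the paper seems to use a learned deconvolution for this during training. More recently, bilinear interpolation also seems to be widely used for upsampling.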
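Convolutionalization can be sketched in a few lines of NumPy. This is only an illustration under assumed sizes (`c`, `h`, `w`, `n_out` and the helper `conv_valid` are hypothetical, not from the paper): the weights of a fully connected layer are reshaped into convolution kernels, which gives the same result on an input of the original size and a coarse spatial score map on a larger input.

```python
import numpy as np

rng = np.random.default_rng(0)
c, h, w, n_out = 4, 3, 3, 5            # hypothetical sizes, not from the paper

# A fully connected layer that expects a flattened c x h x w feature map.
fc_weight = rng.standard_normal((n_out, c * h * w))
fc_bias = rng.standard_normal(n_out)

# The same parameters viewed as n_out convolution kernels of size c x h x w.
conv_weight = fc_weight.reshape(n_out, c, h, w)

def conv_valid(x, weight, bias):
    """Multi-channel 'valid' convolution: x is (c, H, W), weight is (n_out, c, kh, kw)."""
    n, _, kh, kw = weight.shape
    oh, ow = x.shape[1] - kh + 1, x.shape[2] - kw + 1
    out = np.zeros((n, oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[:, i:i + kh, j:j + kw]
            # Contract over channel, height, and width of the patch.
            out[:, i, j] = np.tensordot(weight, patch, axes=3) + bias
    return out

# On an input of the original size, the conv output is 1 x 1 and matches the FC layer.
x_small = rng.standard_normal((c, h, w))
fc_out = fc_weight @ x_small.reshape(-1) + fc_bias
conv_out = conv_valid(x_small, conv_weight, fc_bias)
matches_fc = np.allclose(conv_out[:, 0, 0], fc_out)

# On a larger input, the same weights produce a coarse spatial map of scores.
x_large = rng.standard_normal((c, 6, 8))
score_map = conv_valid(x_large, conv_weight, fc_bias)   # shape (n_out, 4, 6)
```

Each position of `score_map` is the original classifier applied to one window of the larger input, which is why location information survives.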
In practice, skip connections are used to improve performance further.
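A rough NumPy sketch of upsampling plus one skip connection follows. Everything here is an assumption for illustration: single-channel score maps, random values, the helper names `bilinear_kernel` and `upsample`, and a fixed bilinear kernel (a learned deconvolution would update the kernel during training instead of keeping it fixed).

```python
import numpy as np

def bilinear_kernel(factor):
    """2-D bilinear interpolation kernel for upsampling by an integer factor."""
    size = 2 * factor - factor % 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    pos = np.arange(size)
    k1d = 1 - np.abs(pos - center) / factor
    return np.outer(k1d, k1d)

def upsample(x, factor):
    """Transposed convolution of a single-channel map with a fixed bilinear kernel."""
    k = bilinear_kernel(factor)
    size = k.shape[0]
    h, w = x.shape
    full = np.zeros(((h - 1) * factor + size, (w - 1) * factor + size))
    for i in range(h):
        for j in range(w):
            # Each input value "stamps" a scaled copy of the kernel into the output.
            full[i * factor:i * factor + size, j * factor:j * factor + size] += x[i, j] * k
    pad = (size - factor) // 2   # crop so the output is exactly factor times larger
    return full[pad:pad + h * factor, pad:pad + w * factor]

rng = np.random.default_rng(0)
coarse_scores = rng.standard_normal((4, 4))   # hypothetical scores from the deepest layer
finer_scores = rng.standard_normal((8, 8))    # hypothetical scores from a shallower layer

# Skip connection: upsample the coarse prediction 2x and add the finer one.
fused = upsample(coarse_scores, 2) + finer_scores

# Upsample the fused map the rest of the way back to the input resolution.
full_res = upsample(fused, 16)                # shape (128, 128)
```

The coarse map carries the semantics and the finer map carries the location detail, so summing them before the final upsampling gives sharper boundaries than upsampling the coarse map alone.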