ELEKTRONN / elektronn3

A PyTorch-based library for working with 3D and 2D convolutional neural networks, with focus on semantic segmentation of volumetric biomedical image data

Calculating and visualizing effective (empirical) receptive fields of network models #14

Closed · mdraw closed this issue 5 years ago

mdraw commented 6 years ago

The receptive field of network layers (and of the whole network) tells us how much spatial context is available to the network when it predicts class probabilities. Since large (and anisotropic) spatial context is especially important when dealing with large, high-resolution, anisotropic 3D images, we should have a tool to calculate and visualize receptive fields, so we can evaluate different network architectures better.

The "theoretical" receptive field that e.g. ELEKTRONN2 calculates automatically (where it is called "fov") has been shown to be misleading:

  1. Understanding the Effective Receptive Field in Deep Convolutional Neural Networks
  2. Object Detectors Emerge in Deep Scene CNNs (section 3.2)
  3. ParseNet: Looking Wider to See Better (section 3.1 and figure 2).

2. and 3. suggest that the effective receptive field can be computed and visualized empirically by feeding crafted inputs into the network and analysing the relationship between input pixels and network activations. The method proposed in 2. looks rather laborious, whereas the approach described in 3. (section 3.1) seems easier to implement. There is also a project (4.), https://github.com/fornaxai/receptivefield, which aims to calculate effective receptive fields with an even simpler approach (for TensorFlow and Keras models). However, we can't use it directly inside elektronn3 because it is GPL-licensed, so writing our own PyTorch implementation is probably the best way to go.
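
As a rough starting point, such an empirical estimate can be sketched in a few lines of PyTorch: backpropagate from the central output voxel to the input and treat the averaged absolute input gradient (over a few random inputs) as the effective receptive field. This is only a minimal sketch, not an existing elektronn3 API; the function name, signature and the assumption of 5D `(N, C, D, H, W)` inputs are illustrative:

```python
import torch


def effective_receptive_field(model, input_shape, n_samples=16, device='cpu'):
    """Estimate the effective receptive field of ``model`` by backpropagating
    from the central output voxel to the input and averaging the absolute
    input gradients over a few random input samples."""
    model = model.to(device).eval()
    erf = None
    for _ in range(n_samples):
        x = torch.randn(input_shape, device=device, requires_grad=True)
        out = model(x)
        # Index of the central spatial position in the first output channel
        center = (0, 0) + tuple(s // 2 for s in out.shape[2:])
        out[center].backward()
        # Input voxels with non-zero gradient influence the central output voxel
        g = x.grad.detach().abs()[0].sum(dim=0)  # sum over input channels
        erf = g if erf is None else erf + g
    erf /= erf.max().clamp(min=1e-12)  # normalize to [0, 1] for visualization
    return erf.cpu().numpy()
```

For example, with a small (hypothetical) 3D CNN the returned array has shape `(D, H, W)` and can be inspected slice-wise, e.g. with matplotlib, to judge the (an)isotropy of the field of view:

```python
model = torch.nn.Sequential(
    torch.nn.Conv3d(1, 8, 3, padding=1), torch.nn.ReLU(),
    torch.nn.Conv3d(8, 8, (1, 3, 3), padding=(0, 1, 1)), torch.nn.ReLU(),
    torch.nn.Conv3d(8, 2, 1),
)
erf = effective_receptive_field(model, input_shape=(1, 1, 16, 64, 64))
```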