feat: Added experimental support of receptive field estimation

As suggested in #12, this PR adds support for receptive field estimation by:

- computing the layer-specific effective receptive field, stride and padding
- aggregating them so that they stay compatible with the `max_depth` attribute
- adding corresponding unit tests for receptive field estimation
- adding a boolean `receptive_field` argument to the `summary` function, which adds the receptive field information to the console summary (default: `False`)
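
To make the per-layer computation concrete, here is a minimal sketch in plain Python. It is not the code from this PR: `receptive_fields` and the `(kernel_size, stride)` layer encoding are hypothetical names used only for illustration, and element-wise layers such as ReLU are assumed to behave like a 1x1, stride-1 layer. It walks the layers from last to first and tracks how many input positions one final output unit depends on.

```python
from typing import List, Tuple

# Hypothetical encoding: (kernel_size, stride) per layer.
# Element-wise layers such as ReLU are modelled as (1, 1).
Layer = Tuple[int, int]


def receptive_fields(layers: List[Layer]) -> List[int]:
    """Receptive field of one final output unit w.r.t. each layer's input."""
    rf = 1  # a single output unit only "sees" itself
    fields = []
    # Walk from the last layer back to the first one.
    for kernel, stride in reversed(layers):
        rf = stride * (rf - 1) + kernel
        fields.append(rf)
    return fields[::-1]


conv, relu, pool = (3, 1), (1, 1), (2, 2)
# Layout of the VGG16 `features` block: groups of 2, 2, 3, 3, 3
# conv+ReLU pairs, each group followed by a 2x2 max-pooling layer.
vgg16_features: List[Layer] = []
for n_convs in (2, 2, 3, 3, 3):
    vgg16_features += [conv, relu] * n_convs + [pool]

fields = receptive_fields(vgg16_features)
print(fields[0], fields[4], fields[-1])  # 212 208 2
```

Under these assumptions the values match the `Receptive field` column of the VGG16 summary in this description for layers 0, 4 and 30 of `features`.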

Here are a few samples:

from torchvision.models import vgg16
from torchscan import summary
summary(vgg16().eval().cuda(), (3, 224, 224), receptive_field=True)

yields

__________________________________________________________________________________________________________
Layer                        Type                  Output Shape              Param #         Receptive field
==========================================================================================================
vgg                          VGG                   (-1, 1000)                0               212            
├─features                   Sequential            (-1, 512, 7, 7)           0               212            
|    └─0                     Conv2d                (-1, 64, 224, 224)        1,792           212            
|    └─1                     ReLU                  (-1, 64, 224, 224)        0               210            
|    └─2                     Conv2d                (-1, 64, 224, 224)        36,928          210            
|    └─3                     ReLU                  (-1, 64, 224, 224)        0               208            
|    └─4                     MaxPool2d             (-1, 64, 112, 112)        0               208            
|    └─5                     Conv2d                (-1, 128, 112, 112)       73,856          104            
|    └─6                     ReLU                  (-1, 128, 112, 112)       0               102            
|    └─7                     Conv2d                (-1, 128, 112, 112)       147,584         102            
|    └─8                     ReLU                  (-1, 128, 112, 112)       0               100            
|    └─9                     MaxPool2d             (-1, 128, 56, 56)         0               100            
|    └─10                    Conv2d                (-1, 256, 56, 56)         295,168         50             
|    └─11                    ReLU                  (-1, 256, 56, 56)         0               48             
|    └─12                    Conv2d                (-1, 256, 56, 56)         590,080         48             
|    └─13                    ReLU                  (-1, 256, 56, 56)         0               46             
|    └─14                    Conv2d                (-1, 256, 56, 56)         590,080         46             
|    └─15                    ReLU                  (-1, 256, 56, 56)         0               44             
|    └─16                    MaxPool2d             (-1, 256, 28, 28)         0               44             
|    └─17                    Conv2d                (-1, 512, 28, 28)         1,180,160       22             
|    └─18                    ReLU                  (-1, 512, 28, 28)         0               20             
|    └─19                    Conv2d                (-1, 512, 28, 28)         2,359,808       20             
|    └─20                    ReLU                  (-1, 512, 28, 28)         0               18             
|    └─21                    Conv2d                (-1, 512, 28, 28)         2,359,808       18             
|    └─22                    ReLU                  (-1, 512, 28, 28)         0               16             
|    └─23                    MaxPool2d             (-1, 512, 14, 14)         0               16             
|    └─24                    Conv2d                (-1, 512, 14, 14)         2,359,808       8              
|    └─25                    ReLU                  (-1, 512, 14, 14)         0               6              
|    └─26                    Conv2d                (-1, 512, 14, 14)         2,359,808       6              
|    └─27                    ReLU                  (-1, 512, 14, 14)         0               4              
|    └─28                    Conv2d                (-1, 512, 14, 14)         2,359,808       4              
|    └─29                    ReLU                  (-1, 512, 14, 14)         0               2              
|    └─30                    MaxPool2d             (-1, 512, 7, 7)           0               2              
├─avgpool                    AdaptiveAvgPool2d     (-1, 512, 7, 7)           0               1              
├─classifier                 Sequential            (-1, 1000)                0               1              
|    └─0                     Linear                (-1, 4096)                102,764,544     1              
|    └─1                     ReLU                  (-1, 4096)                0               1              
|    └─2                     Dropout               (-1, 4096)                0               1              
|    └─3                     Linear                (-1, 4096)                16,781,312      1              
|    └─4                     ReLU                  (-1, 4096)                0               1              
|    └─5                     Dropout               (-1, 4096)                0               1              
|    └─6                     Linear                (-1, 1000)                4,097,000       1              
==========================================================================================================
Trainable params: 138,357,544
Non-trainable params: 0
Total params: 138,357,544
----------------------------------------------------------------------------------------------------------
Model size (params + buffers): 527.79 Mb
Framework & CUDA overhead: 504.26 Mb
Total RAM usage: 1032.05 Mb
----------------------------------------------------------------------------------------------------------
Floating Point Operations on forward: 30.96 GFLOPs
Multiply-Accumulations on forward: 15.47 GMACs
Direct memory accesses on forward: 15.52 GDMAs
__________________________________________________________________________________________________________

and using the max_depth option:

____________________________________________________________________________________________________________
Layer                        Type                  Output Shape              Param #         Receptive field
============================================================================================================
vgg                          VGG                   (-1, 1000)                0               212            
├─features                   Sequential            (-1, 512, 7, 7)           14,714,688      212            
├─avgpool                    AdaptiveAvgPool2d     (-1, 512, 7, 7)           0               1              
├─classifier                 Sequential            (-1, 1000)                123,642,856     1              
============================================================================================================
Trainable params: 138,357,544
Non-trainable params: 0
Total params: 138,357,544
------------------------------------------------------------------------------------------------------------
Model size (params + buffers): 527.79 Mb
Framework & CUDA overhead: 487.38 Mb
Total RAM usage: 1015.17 Mb
------------------------------------------------------------------------------------------------------------
Floating Point Operations on forward: 30.96 GFLOPs
Multiply-Accumulations on forward: 15.47 GMACs
Direct memory accesses on forward: 15.52 GDMAs
____________________________________________________________________________________________________________
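
The collapsed `features` row above can be reproduced by folding the children's values into one effective triple. The sketch below is an illustration under assumptions rather than the implementation in this PR: `compose` and the `(receptive_field, stride, padding)` tuples are hypothetical, and it relies on the usual composition rule for two layers applied in sequence.

```python
from functools import reduce
from typing import List, Tuple

# Hypothetical encoding: (receptive_field, stride, padding) of a layer
# or of a chain of layers.
RFInfo = Tuple[int, int, int]


def compose(first: RFInfo, second: RFInfo) -> RFInfo:
    """Effective triple of `first` followed by `second`."""
    rf1, s1, p1 = first
    rf2, s2, p2 = second
    return rf1 + (rf2 - 1) * s1, s1 * s2, p1 + p2 * s1


conv, relu, pool = (3, 1, 1), (1, 1, 0), (2, 2, 0)
# Children of the VGG16 `features` block, in forward order.
children: List[RFInfo] = []
for n_convs in (2, 2, 3, 3, 3):
    children += [conv, relu] * n_convs + [pool]

rf, stride, padding = reduce(compose, children)
# Sanity check: a 224-pixel input should give the 7x7 map shown above.
out_size = (224 + 2 * padding - rf) // stride + 1
print(rf, stride, padding, out_size)  # 212 32 90 7
```

Because the rule is associative, a parent module at any depth can be summarised from its children alone, which is what makes the aggregation compatible with truncating the tree at `max_depth`.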

Note: this feature is experimental and, for now, only supports single-path (purely sequential) architectures. Models with multiple branches or residual connections are not correctly supported in this PR.

Codecov Report (frgfm/torch-scan #21)

| Impacted Files | Coverage Δ | |
|---|---|---|
| torchscan/utils.py | `67.39% <70.00%> (-0.47%)` | :arrow_down: |
| torchscan/crawler.py | `81.90% <93.75%> (+1.46%)` | :arrow_up: |
| torchscan/modules/__init__.py | `100.00% <100.00%> (ø)` | |
| torchscan/modules/receptive.py | `100.00% <100.00%> (ø)` | |