clovaai / voxceleb_trainer

In defence of metric learning for speaker recognition
MIT License

How to calculate multiply-accumulate operations (MACs) #105

Closed Liu-Tianchi closed 3 years ago

Liu-Tianchi commented 3 years ago

Dear author,

I see that you also report the MACs in your paper. May I know which tool you used? 'ptflops' does not seem to work for this case. Thank you!

Liu-Tianchi commented 3 years ago

I used ptflops and got the following results:

Computational complexity: 0.45 GMac
Number of parameters: 1.44 M

However, it also prints the following warnings:

Warning: module Sigmoid is treated as a zero-op.
Warning: module SELayer is treated as a zero-op.
Warning: module SEBasicBlock is treated as a zero-op.
Warning: module InstanceNorm1d is treated as a zero-op.
Warning: module Spectrogram is treated as a zero-op.
Warning: module MelScale is treated as a zero-op.
Warning: module MelSpectrogram is treated as a zero-op.
Warning: module ResNetSE is treated as a zero-op.

Some of the modules listed above, such as SEBasicBlock, actually do have parameters and operations. Can you recommend a tool or another way to get accurate results? Thank you!
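A note on these warnings: as far as I understand ptflops, "treated as a zero-op" only means that no counting rule is registered for that module type, so the module itself contributes nothing. The supported layers inside it (Conv2d, BatchNorm2d, Linear, ...) should still be counted and rolled up into the parent's total. The sketch below checks this against the figures for the first SEBasicBlock of layer1, taken from the per-module breakdown posted later in this thread. The variable names are made up for illustration.

```python
# Per-child MACs (in MMac) of the first SEBasicBlock in layer1, copied from
# the ptflops breakdown in this thread. The dict name is illustrative only.
children_mmac = {
    "conv1": 9.308,
    "bn1": 0.129,
    "conv2": 9.308,
    "bn2": 0.129,
    "relu": 0.129,
    "se": 0.065,
}

# Assumption: a container without its own counting rule adds 0 by itself,
# so its reported total is just the sum over its children.
own_mmac = 0.0
block_total = own_mmac + sum(children_mmac.values())

reported_total = 19.069  # value printed by ptflops for the whole block
print(f"sum of children: {block_total:.3f} MMac, reported: {reported_total} MMac")

# The two agree up to rounding of the three-decimal figures.
assert abs(block_total - reported_total) < 0.01
```

What this kind of count probably still misses are operations that run directly inside forward() (for example the SE channel scaling and the residual addition) and leaf modules without a rule, such as Sigmoid or the spectrogram front end, so the total is likely a slight underestimate.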

Liu-Tianchi commented 3 years ago

Hi, I checked again: the result shown below matches the one reported in the paper, so I am closing this issue. Thank you!

ResNetSE(
  1.437 M, 99.991% Params, 451.779 MMac, 100.000% MACs,
  (conv1): Conv2d(0.001 M, 0.055% Params, 3.167 MMac, 0.701% MACs, 1, 16, kernel_size=(7, 7), stride=(2, 1), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(0.0 M, 0.002% Params, 0.129 MMac, 0.029% MACs, 16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(0.0 M, 0.000% Params, 0.065 MMac, 0.014% MACs, inplace=True)
  (layer1): Sequential(
    0.014 M, 0.992% Params, 57.207 MMac, 12.663% MACs,
    (0): SEBasicBlock(
      0.005 M, 0.331% Params, 19.069 MMac, 4.221% MACs,
      (conv1): Conv2d(0.002 M, 0.160% Params, 9.308 MMac, 2.060% MACs, 16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(0.0 M, 0.002% Params, 0.129 MMac, 0.029% MACs, 16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(0.002 M, 0.160% Params, 9.308 MMac, 2.060% MACs, 16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(0.0 M, 0.002% Params, 0.129 MMac, 0.029% MACs, 16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(0.0 M, 0.000% Params, 0.129 MMac, 0.029% MACs, inplace=True)
      (se): SELayer(
        0.0 M, 0.006% Params, 0.065 MMac, 0.014% MACs,
        (avg_pool): AdaptiveAvgPool2d(0.0 M, 0.000% Params, 0.065 MMac, 0.014% MACs, output_size=1)
        (fc): Sequential(
          0.0 M, 0.006% Params, 0.0 MMac, 0.000% MACs,
          (0): Linear(0.0 M, 0.002% Params, 0.0 MMac, 0.000% MACs, in_features=16, out_features=2, bias=True)
          (1): ReLU(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, inplace=True)
          (2): Linear(0.0 M, 0.003% Params, 0.0 MMac, 0.000% MACs, in_features=2, out_features=16, bias=True)
          (3): Sigmoid(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, )
        )
      )
    )
    (1): SEBasicBlock(
      0.005 M, 0.331% Params, 19.069 MMac, 4.221% MACs,
      (conv1): Conv2d(0.002 M, 0.160% Params, 9.308 MMac, 2.060% MACs, 16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(0.0 M, 0.002% Params, 0.129 MMac, 0.029% MACs, 16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(0.002 M, 0.160% Params, 9.308 MMac, 2.060% MACs, 16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(0.0 M, 0.002% Params, 0.129 MMac, 0.029% MACs, 16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(0.0 M, 0.000% Params, 0.129 MMac, 0.029% MACs, inplace=True)
      (se): SELayer(
        0.0 M, 0.006% Params, 0.065 MMac, 0.014% MACs,
        (avg_pool): AdaptiveAvgPool2d(0.0 M, 0.000% Params, 0.065 MMac, 0.014% MACs, output_size=1)
        (fc): Sequential(
          0.0 M, 0.006% Params, 0.0 MMac, 0.000% MACs,
          (0): Linear(0.0 M, 0.002% Params, 0.0 MMac, 0.000% MACs, in_features=16, out_features=2, bias=True)
          (1): ReLU(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, inplace=True)
          (2): Linear(0.0 M, 0.003% Params, 0.0 MMac, 0.000% MACs, in_features=2, out_features=16, bias=True)
          (3): Sigmoid(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, )
        )
      )
    )
    (2): SEBasicBlock(
      0.005 M, 0.331% Params, 19.069 MMac, 4.221% MACs,
      (conv1): Conv2d(0.002 M, 0.160% Params, 9.308 MMac, 2.060% MACs, 16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(0.0 M, 0.002% Params, 0.129 MMac, 0.029% MACs, 16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(0.002 M, 0.160% Params, 9.308 MMac, 2.060% MACs, 16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(0.0 M, 0.002% Params, 0.129 MMac, 0.029% MACs, 16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(0.0 M, 0.000% Params, 0.129 MMac, 0.029% MACs, inplace=True)
      (se): SELayer(
        0.0 M, 0.006% Params, 0.065 MMac, 0.014% MACs,
        (avg_pool): AdaptiveAvgPool2d(0.0 M, 0.000% Params, 0.065 MMac, 0.014% MACs, output_size=1)
        (fc): Sequential(
          0.0 M, 0.006% Params, 0.0 MMac, 0.000% MACs,
          (0): Linear(0.0 M, 0.002% Params, 0.0 MMac, 0.000% MACs, in_features=16, out_features=2, bias=True)
          (1): ReLU(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, inplace=True)
          (2): Linear(0.0 M, 0.003% Params, 0.0 MMac, 0.000% MACs, in_features=2, out_features=16, bias=True)
          (3): Sigmoid(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, )
        )
      )
    )
  )
  (layer2): Sequential(
    0.071 M, 4.967% Params, 71.299 MMac, 15.782% MACs,
    (0): SEBasicBlock(
      0.015 M, 1.031% Params, 14.771 MMac, 3.269% MACs,
      (conv1): Conv2d(0.005 M, 0.321% Params, 4.654 MMac, 1.030% MACs, 16, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(0.0 M, 0.004% Params, 0.065 MMac, 0.014% MACs, 32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(0.009 M, 0.641% Params, 9.308 MMac, 2.060% MACs, 32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(0.0 M, 0.004% Params, 0.065 MMac, 0.014% MACs, 32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(0.0 M, 0.000% Params, 0.065 MMac, 0.014% MACs, inplace=True)
      (se): SELayer(
        0.0 M, 0.020% Params, 0.033 MMac, 0.007% MACs,
        (avg_pool): AdaptiveAvgPool2d(0.0 M, 0.000% Params, 0.032 MMac, 0.007% MACs, output_size=1)
        (fc): Sequential(
          0.0 M, 0.020% Params, 0.0 MMac, 0.000% MACs,
          (0): Linear(0.0 M, 0.009% Params, 0.0 MMac, 0.000% MACs, in_features=32, out_features=4, bias=True)
          (1): ReLU(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, inplace=True)
          (2): Linear(0.0 M, 0.011% Params, 0.0 MMac, 0.000% MACs, in_features=4, out_features=32, bias=True)
          (3): Sigmoid(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, )
        )
      )
      (downsample): Sequential(
        0.001 M, 0.040% Params, 0.582 MMac, 0.129% MACs,
        (0): Conv2d(0.001 M, 0.036% Params, 0.517 MMac, 0.114% MACs, 16, 32, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(0.0 M, 0.004% Params, 0.065 MMac, 0.014% MACs, 32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): SEBasicBlock(
      0.019 M, 1.312% Params, 18.843 MMac, 4.171% MACs,
      (conv1): Conv2d(0.009 M, 0.641% Params, 9.308 MMac, 2.060% MACs, 32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(0.0 M, 0.004% Params, 0.065 MMac, 0.014% MACs, 32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(0.009 M, 0.641% Params, 9.308 MMac, 2.060% MACs, 32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(0.0 M, 0.004% Params, 0.065 MMac, 0.014% MACs, 32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(0.0 M, 0.000% Params, 0.065 MMac, 0.014% MACs, inplace=True)
      (se): SELayer(
        0.0 M, 0.020% Params, 0.033 MMac, 0.007% MACs,
        (avg_pool): AdaptiveAvgPool2d(0.0 M, 0.000% Params, 0.032 MMac, 0.007% MACs, output_size=1)
        (fc): Sequential(
          0.0 M, 0.020% Params, 0.0 MMac, 0.000% MACs,
          (0): Linear(0.0 M, 0.009% Params, 0.0 MMac, 0.000% MACs, in_features=32, out_features=4, bias=True)
          (1): ReLU(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, inplace=True)
          (2): Linear(0.0 M, 0.011% Params, 0.0 MMac, 0.000% MACs, in_features=4, out_features=32, bias=True)
          (3): Sigmoid(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, )
        )
      )
    )
    (2): SEBasicBlock(
      0.019 M, 1.312% Params, 18.843 MMac, 4.171% MACs,
      (conv1): Conv2d(0.009 M, 0.641% Params, 9.308 MMac, 2.060% MACs, 32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(0.0 M, 0.004% Params, 0.065 MMac, 0.014% MACs, 32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(0.009 M, 0.641% Params, 9.308 MMac, 2.060% MACs, 32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(0.0 M, 0.004% Params, 0.065 MMac, 0.014% MACs, 32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(0.0 M, 0.000% Params, 0.065 MMac, 0.014% MACs, inplace=True)
      (se): SELayer(
        0.0 M, 0.020% Params, 0.033 MMac, 0.007% MACs,
        (avg_pool): AdaptiveAvgPool2d(0.0 M, 0.000% Params, 0.032 MMac, 0.007% MACs, output_size=1)
        (fc): Sequential(
          0.0 M, 0.020% Params, 0.0 MMac, 0.000% MACs,
          (0): Linear(0.0 M, 0.009% Params, 0.0 MMac, 0.000% MACs, in_features=32, out_features=4, bias=True)
          (1): ReLU(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, inplace=True)
          (2): Linear(0.0 M, 0.011% Params, 0.0 MMac, 0.000% MACs, in_features=4, out_features=32, bias=True)
          (3): Sigmoid(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, )
        )
      )
    )
    (3): SEBasicBlock(
      0.019 M, 1.312% Params, 18.843 MMac, 4.171% MACs,
      (conv1): Conv2d(0.009 M, 0.641% Params, 9.308 MMac, 2.060% MACs, 32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(0.0 M, 0.004% Params, 0.065 MMac, 0.014% MACs, 32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(0.009 M, 0.641% Params, 9.308 MMac, 2.060% MACs, 32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(0.0 M, 0.004% Params, 0.065 MMac, 0.014% MACs, 32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(0.0 M, 0.000% Params, 0.065 MMac, 0.014% MACs, inplace=True)
      (se): SELayer(
        0.0 M, 0.020% Params, 0.033 MMac, 0.007% MACs,
        (avg_pool): AdaptiveAvgPool2d(0.0 M, 0.000% Params, 0.032 MMac, 0.007% MACs, output_size=1)
        (fc): Sequential(
          0.0 M, 0.020% Params, 0.0 MMac, 0.000% MACs,
          (0): Linear(0.0 M, 0.009% Params, 0.0 MMac, 0.000% MACs, in_features=32, out_features=4, bias=True)
          (1): ReLU(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, inplace=True)
          (2): Linear(0.0 M, 0.011% Params, 0.0 MMac, 0.000% MACs, in_features=4, out_features=32, bias=True)
          (3): Sigmoid(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, )
        )
      )
    )
  )
  (layer3): Sequential(
    0.434 M, 30.216% Params, 109.351 MMac, 24.204% MACs,
    (0): SEBasicBlock(
      0.059 M, 4.093% Params, 14.771 MMac, 3.269% MACs,
      (conv1): Conv2d(0.018 M, 1.283% Params, 4.7 MMac, 1.040% MACs, 32, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(0.0 M, 0.009% Params, 0.033 MMac, 0.007% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(0.037 M, 2.565% Params, 9.4 MMac, 2.081% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(0.0 M, 0.009% Params, 0.033 MMac, 0.007% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(0.0 M, 0.000% Params, 0.033 MMac, 0.007% MACs, inplace=True)
      (se): SELayer(
        0.001 M, 0.076% Params, 0.017 MMac, 0.004% MACs,
        (avg_pool): AdaptiveAvgPool2d(0.0 M, 0.000% Params, 0.016 MMac, 0.004% MACs, output_size=1)
        (fc): Sequential(
          0.001 M, 0.076% Params, 0.001 MMac, 0.000% MACs,
          (0): Linear(0.001 M, 0.036% Params, 0.001 MMac, 0.000% MACs, in_features=64, out_features=8, bias=True)
          (1): ReLU(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, inplace=True)
          (2): Linear(0.001 M, 0.040% Params, 0.001 MMac, 0.000% MACs, in_features=8, out_features=64, bias=True)
          (3): Sigmoid(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, )
        )
      )
      (downsample): Sequential(
        0.002 M, 0.151% Params, 0.555 MMac, 0.123% MACs,
        (0): Conv2d(0.002 M, 0.143% Params, 0.522 MMac, 0.116% MACs, 32, 64, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(0.0 M, 0.009% Params, 0.033 MMac, 0.007% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): SEBasicBlock(
      0.075 M, 5.224% Params, 18.916 MMac, 4.187% MACs,
      (conv1): Conv2d(0.037 M, 2.565% Params, 9.4 MMac, 2.081% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(0.0 M, 0.009% Params, 0.033 MMac, 0.007% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(0.037 M, 2.565% Params, 9.4 MMac, 2.081% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(0.0 M, 0.009% Params, 0.033 MMac, 0.007% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(0.0 M, 0.000% Params, 0.033 MMac, 0.007% MACs, inplace=True)
      (se): SELayer(
        0.001 M, 0.076% Params, 0.017 MMac, 0.004% MACs,
        (avg_pool): AdaptiveAvgPool2d(0.0 M, 0.000% Params, 0.016 MMac, 0.004% MACs, output_size=1)
        (fc): Sequential(
          0.001 M, 0.076% Params, 0.001 MMac, 0.000% MACs,
          (0): Linear(0.001 M, 0.036% Params, 0.001 MMac, 0.000% MACs, in_features=64, out_features=8, bias=True)
          (1): ReLU(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, inplace=True)
          (2): Linear(0.001 M, 0.040% Params, 0.001 MMac, 0.000% MACs, in_features=8, out_features=64, bias=True)
          (3): Sigmoid(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, )
        )
      )
    )
    (2): SEBasicBlock(
      0.075 M, 5.224% Params, 18.916 MMac, 4.187% MACs,
      (conv1): Conv2d(0.037 M, 2.565% Params, 9.4 MMac, 2.081% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(0.0 M, 0.009% Params, 0.033 MMac, 0.007% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(0.037 M, 2.565% Params, 9.4 MMac, 2.081% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(0.0 M, 0.009% Params, 0.033 MMac, 0.007% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(0.0 M, 0.000% Params, 0.033 MMac, 0.007% MACs, inplace=True)
      (se): SELayer(
        0.001 M, 0.076% Params, 0.017 MMac, 0.004% MACs,
        (avg_pool): AdaptiveAvgPool2d(0.0 M, 0.000% Params, 0.016 MMac, 0.004% MACs, output_size=1)
        (fc): Sequential(
          0.001 M, 0.076% Params, 0.001 MMac, 0.000% MACs,
          (0): Linear(0.001 M, 0.036% Params, 0.001 MMac, 0.000% MACs, in_features=64, out_features=8, bias=True)
          (1): ReLU(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, inplace=True)
          (2): Linear(0.001 M, 0.040% Params, 0.001 MMac, 0.000% MACs, in_features=8, out_features=64, bias=True)
          (3): Sigmoid(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, )
        )
      )
    )
    (3): SEBasicBlock(
      0.075 M, 5.224% Params, 18.916 MMac, 4.187% MACs,
      (conv1): Conv2d(0.037 M, 2.565% Params, 9.4 MMac, 2.081% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(0.0 M, 0.009% Params, 0.033 MMac, 0.007% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(0.037 M, 2.565% Params, 9.4 MMac, 2.081% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(0.0 M, 0.009% Params, 0.033 MMac, 0.007% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(0.0 M, 0.000% Params, 0.033 MMac, 0.007% MACs, inplace=True)
      (se): SELayer(
        0.001 M, 0.076% Params, 0.017 MMac, 0.004% MACs,
        (avg_pool): AdaptiveAvgPool2d(0.0 M, 0.000% Params, 0.016 MMac, 0.004% MACs, output_size=1)
        (fc): Sequential(
          0.001 M, 0.076% Params, 0.001 MMac, 0.000% MACs,
          (0): Linear(0.001 M, 0.036% Params, 0.001 MMac, 0.000% MACs, in_features=64, out_features=8, bias=True)
          (1): ReLU(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, inplace=True)
          (2): Linear(0.001 M, 0.040% Params, 0.001 MMac, 0.000% MACs, in_features=8, out_features=64, bias=True)
          (3): Sigmoid(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, )
        )
      )
    )
    (4): SEBasicBlock(
      0.075 M, 5.224% Params, 18.916 MMac, 4.187% MACs,
      (conv1): Conv2d(0.037 M, 2.565% Params, 9.4 MMac, 2.081% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(0.0 M, 0.009% Params, 0.033 MMac, 0.007% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(0.037 M, 2.565% Params, 9.4 MMac, 2.081% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(0.0 M, 0.009% Params, 0.033 MMac, 0.007% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(0.0 M, 0.000% Params, 0.033 MMac, 0.007% MACs, inplace=True)
      (se): SELayer(
        0.001 M, 0.076% Params, 0.017 MMac, 0.004% MACs,
        (avg_pool): AdaptiveAvgPool2d(0.0 M, 0.000% Params, 0.016 MMac, 0.004% MACs, output_size=1)
        (fc): Sequential(
          0.001 M, 0.076% Params, 0.001 MMac, 0.000% MACs,
          (0): Linear(0.001 M, 0.036% Params, 0.001 MMac, 0.000% MACs, in_features=64, out_features=8, bias=True)
          (1): ReLU(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, inplace=True)
          (2): Linear(0.001 M, 0.040% Params, 0.001 MMac, 0.000% MACs, in_features=8, out_features=64, bias=True)
          (3): Sigmoid(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, )
        )
      )
    )
    (5): SEBasicBlock(
      0.075 M, 5.224% Params, 18.916 MMac, 4.187% MACs,
      (conv1): Conv2d(0.037 M, 2.565% Params, 9.4 MMac, 2.081% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(0.0 M, 0.009% Params, 0.033 MMac, 0.007% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(0.037 M, 2.565% Params, 9.4 MMac, 2.081% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(0.0 M, 0.009% Params, 0.033 MMac, 0.007% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(0.0 M, 0.000% Params, 0.033 MMac, 0.007% MACs, inplace=True)
      (se): SELayer(
        0.001 M, 0.076% Params, 0.017 MMac, 0.004% MACs,
        (avg_pool): AdaptiveAvgPool2d(0.0 M, 0.000% Params, 0.016 MMac, 0.004% MACs, output_size=1)
        (fc): Sequential(
          0.001 M, 0.076% Params, 0.001 MMac, 0.000% MACs,
          (0): Linear(0.001 M, 0.036% Params, 0.001 MMac, 0.000% MACs, in_features=64, out_features=8, bias=True)
          (1): ReLU(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, inplace=True)
          (2): Linear(0.001 M, 0.040% Params, 0.001 MMac, 0.000% MACs, in_features=8, out_features=64, bias=True)
          (3): Sigmoid(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, )
        )
      )
    )
  )
  (layer4): Sequential(
    0.834 M, 58.014% Params, 209.659 MMac, 46.408% MACs,
    (0): SEBasicBlock(
      0.234 M, 16.310% Params, 58.789 MMac, 13.013% MACs,
      (conv1): Conv2d(0.074 M, 5.130% Params, 18.801 MMac, 4.161% MACs, 64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(0.0 M, 0.018% Params, 0.065 MMac, 0.014% MACs, 128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(0.147 M, 10.261% Params, 37.601 MMac, 8.323% MACs, 128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(0.0 M, 0.018% Params, 0.065 MMac, 0.014% MACs, 128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(0.0 M, 0.000% Params, 0.065 MMac, 0.014% MACs, inplace=True)
      (se): SELayer(
        0.004 M, 0.295% Params, 0.037 MMac, 0.008% MACs,
        (avg_pool): AdaptiveAvgPool2d(0.0 M, 0.000% Params, 0.033 MMac, 0.007% MACs, output_size=1)
        (fc): Sequential(
          0.004 M, 0.295% Params, 0.004 MMac, 0.001% MACs,
          (0): Linear(0.002 M, 0.144% Params, 0.002 MMac, 0.000% MACs, in_features=128, out_features=16, bias=True)
          (1): ReLU(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, inplace=True)
          (2): Linear(0.002 M, 0.151% Params, 0.002 MMac, 0.000% MACs, in_features=16, out_features=128, bias=True)
          (3): Sigmoid(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, )
        )
      )
      (downsample): Sequential(
        0.008 M, 0.588% Params, 2.154 MMac, 0.477% MACs,
        (0): Conv2d(0.008 M, 0.570% Params, 2.089 MMac, 0.462% MACs, 64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(0.0 M, 0.018% Params, 0.065 MMac, 0.014% MACs, 128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): SEBasicBlock(
      0.3 M, 20.852% Params, 75.435 MMac, 16.697% MACs,
      (conv1): Conv2d(0.147 M, 10.261% Params, 37.601 MMac, 8.323% MACs, 128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(0.0 M, 0.018% Params, 0.065 MMac, 0.014% MACs, 128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(0.147 M, 10.261% Params, 37.601 MMac, 8.323% MACs, 128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(0.0 M, 0.018% Params, 0.065 MMac, 0.014% MACs, 128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(0.0 M, 0.000% Params, 0.065 MMac, 0.014% MACs, inplace=True)
      (se): SELayer(
        0.004 M, 0.295% Params, 0.037 MMac, 0.008% MACs,
        (avg_pool): AdaptiveAvgPool2d(0.0 M, 0.000% Params, 0.033 MMac, 0.007% MACs, output_size=1)
        (fc): Sequential(
          0.004 M, 0.295% Params, 0.004 MMac, 0.001% MACs,
          (0): Linear(0.002 M, 0.144% Params, 0.002 MMac, 0.000% MACs, in_features=128, out_features=16, bias=True)
          (1): ReLU(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, inplace=True)
          (2): Linear(0.002 M, 0.151% Params, 0.002 MMac, 0.000% MACs, in_features=16, out_features=128, bias=True)
          (3): Sigmoid(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, )
        )
      )
    )
    (2): SEBasicBlock(
      0.3 M, 20.852% Params, 75.435 MMac, 16.697% MACs,
      (conv1): Conv2d(0.147 M, 10.261% Params, 37.601 MMac, 8.323% MACs, 128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(0.0 M, 0.018% Params, 0.065 MMac, 0.014% MACs, 128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(0.147 M, 10.261% Params, 37.601 MMac, 8.323% MACs, 128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(0.0 M, 0.018% Params, 0.065 MMac, 0.014% MACs, 128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(0.0 M, 0.000% Params, 0.065 MMac, 0.014% MACs, inplace=True)
      (se): SELayer(
        0.004 M, 0.295% Params, 0.037 MMac, 0.008% MACs,
        (avg_pool): AdaptiveAvgPool2d(0.0 M, 0.000% Params, 0.033 MMac, 0.007% MACs, output_size=1)
        (fc): Sequential(
          0.004 M, 0.295% Params, 0.004 MMac, 0.001% MACs,
          (0): Linear(0.002 M, 0.144% Params, 0.002 MMac, 0.000% MACs, in_features=128, out_features=16, bias=True)
          (1): ReLU(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, inplace=True)
          (2): Linear(0.002 M, 0.151% Params, 0.002 MMac, 0.000% MACs, in_features=16, out_features=128, bias=True)
          (3): Sigmoid(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, )
        )
      )
    )
  )
  (instancenorm): InstanceNorm1d(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, 40, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
  (torchfb): MelSpectrogram(
    0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs,
    (spectrogram): Spectrogram(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, )
    (mel_scale): MelScale(0.0 M, 0.000% Params, 0.0 MMac, 0.000% MACs, )
  )
  (sap_linear): Linear(0.017 M, 1.149% Params, 0.836 MMac, 0.185% MACs, in_features=128, out_features=128, bias=True)
  (fc): Linear(0.066 M, 4.596% Params, 0.066 MMac, 0.015% MACs, in_features=128, out_features=512, bias=True)
)
Computational complexity: 0.4518 GMac
Number of parameters: 1.4371 M
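The convolution figures in this breakdown can be cross-checked by hand: a bias-free Conv2d costs C_in x C_out x k_h x k_w multiply-accumulates per output position. The sketch below assumes a 40 x 202 input feature map (mel bins x frames). That size is not stated anywhere in this thread; it is inferred because it reproduces the printed numbers. The function names are illustrative.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution along one axis (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv2d_macs(c_in, c_out, kernel, stride, padding, in_hw):
    """Return (MACs, output (H, W)) for a bias-free Conv2d."""
    out_h = conv_out(in_hw[0], kernel[0], stride[0], padding[0])
    out_w = conv_out(in_hw[1], kernel[1], stride[1], padding[1])
    macs = c_in * c_out * kernel[0] * kernel[1] * out_h * out_w
    return macs, (out_h, out_w)


# ASSUMPTION: 40 mel bins x 202 frames, inferred from the reported MACs.
hw = (40, 202)

# Stem conv and the first conv1 of each stage, with the shapes listed above.
stem_macs, hw = conv2d_macs(1, 16, (7, 7), (2, 1), (3, 3), hw)
l1_macs, hw = conv2d_macs(16, 16, (3, 3), (1, 1), (1, 1), hw)
l2_macs, hw = conv2d_macs(16, 32, (3, 3), (2, 2), (1, 1), hw)
l3_macs, hw = conv2d_macs(32, 64, (3, 3), (2, 2), (1, 1), hw)
l4_macs, hw = conv2d_macs(64, 128, (3, 3), (1, 1), (1, 1), hw)

for name, macs in [
    ("conv1 (stem)", stem_macs),
    ("layer1.0.conv1", l1_macs),
    ("layer2.0.conv1", l2_macs),
    ("layer3.0.conv1", l3_macs),
    ("layer4.0.conv1", l4_macs),
]:
    print(f"{name}: {macs / 1e6:.3f} MMac")
```

Under that assumption the script prints 3.167, 9.308, 4.654, 4.700 and 18.801 MMac, which matches the corresponding Conv2d entries above.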