```
VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (0_linear_quant): LinearQuant(sf=2, bits=8, overflow_rate=0.000, counter=0)
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (2_linear_quant): LinearQuant(sf=1, bits=8, overflow_rate=0.000, counter=0)
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (5_linear_quant): LinearQuant(sf=0, bits=8, overflow_rate=0.000, counter=0)
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (7_linear_quant): LinearQuant(sf=0, bits=8, overflow_rate=0.000, counter=0)
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (10_linear_quant): LinearQuant(sf=-1, bits=8, overflow_rate=0.000, counter=0)
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (12_linear_quant): LinearQuant(sf=-1, bits=8, overflow_rate=0.000, counter=0)
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (14_linear_quant): LinearQuant(sf=-1, bits=8, overflow_rate=0.000, counter=0)
    (15): ReLU(inplace=True)
    (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (17_linear_quant): LinearQuant(sf=-1, bits=8, overflow_rate=0.000, counter=0)
    (18): ReLU(inplace=True)
    (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (19_linear_quant): LinearQuant(sf=-1, bits=8, overflow_rate=0.000, counter=0)
    (20): ReLU(inplace=True)
    (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (21_linear_quant): LinearQuant(sf=0, bits=8, overflow_rate=0.000, counter=0)
    (22): ReLU(inplace=True)
    (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (24_linear_quant): LinearQuant(sf=0, bits=8, overflow_rate=0.000, counter=0)
    (25): ReLU(inplace=True)
    (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (26_linear_quant): LinearQuant(sf=0, bits=8, overflow_rate=0.000, counter=0)
    (27): ReLU(inplace=True)
    (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (28_linear_quant): LinearQuant(sf=0, bits=8, overflow_rate=0.000, counter=0)
    (29): ReLU(inplace=True)
    (30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (classifier): Sequential(
    (0): Linear(in_features=25088, out_features=4096, bias=True)
    (0_linear_quant): LinearQuant(sf=1, bits=8, overflow_rate=0.000, counter=0)
    (1): ReLU(inplace=True)
    (2): Dropout(p=0.5, inplace=False)
    (3): Linear(in_features=4096, out_features=4096, bias=True)
    (3_linear_quant): LinearQuant(sf=2, bits=8, overflow_rate=0.000, counter=0)
    (4): ReLU(inplace=True)
    (5): Dropout(p=0.5, inplace=False)
    (6): Linear(in_features=4096, out_features=1000, bias=True)
    (6_linear_quant): LinearQuant(sf=1, bits=8, overflow_rate=0.000, counter=0)
  )
)
```
In the `features.0` layer, the weights are 8 bits but the input feature map (ifmap) is still 32 bits. A `LinearQuant` layer is only generated *after* each convolution layer, so I don't understand how activation quantization works for the first layer. Where do you quantize the first input of the network?
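For context, here is a minimal sketch of what I understand a `LinearQuant(sf, bits)` module to compute, and how the same operation could be applied to the 32-bit input image before `features.0`. The function name `linear_quantize`, the step size `delta = 2**-sf`, and the rounding/clamping details are my assumptions inferred from the printed module parameters, not necessarily the repository's actual implementation.

```python
import torch

def linear_quantize(x, sf, bits):
    """Hypothetical fixed-point quantization with scale factor `sf` and
    word length `bits`. Assumption: step size delta = 2**-sf, values
    clamped to the signed range representable in `bits` bits."""
    delta = 2.0 ** (-sf)                      # quantization step size
    bound = 2.0 ** (bits - 1)                 # e.g. bits=8 -> [-128, 127] integer levels
    x_int = torch.clamp(torch.round(x / delta), -bound, bound - 1)
    return x_int * delta                      # dequantize back to float for the next op

# Assumed usage: quantize the fp32 input image to 8 bits before it
# reaches features.0, mirroring what the *_linear_quant modules do to
# the activations after each Conv2d/Linear layer.
img = torch.randn(1, 3, 224, 224)             # 32-bit input feature map
img_q = linear_quantize(img, sf=2, bits=8)    # sf=2 chosen only for illustration
```

If nothing like this is applied before `features.0`, the first convolution would consume the unquantized 32-bit input, which is exactly the situation the question describes.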