ModelTC / MQBench

Model Quantization Benchmark
Apache License 2.0
766 stars 140 forks

MQBench results are not bit-exact with SNPE DSP results #109

Closed changewOw closed 2 years ago

changewOw commented 2 years ago

MQBench is a very interesting project.

Environment: PyTorch: 1.8.1; MQBench: branch main, e2175203; SNPE: snpe-1.61.0.3358

Problem: I ran a simple test with a model containing only two convolution layers and compared MQBench's quantized (quant sim) results against the SNPE DSP results; they are not bit-exact. Is this expected, or did I do something wrong?

Reproduction:

```python
import numpy as np
import torch
import torch.nn as nn

from mqbench.prepare_by_platform import prepare_by_platform, BackendType
from mqbench.utils.state import enable_calibration, enable_quantization
from mqbench.convert_deploy import convert_deploy


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv2d(3, 128, 1, 1, bias=True)
        self.conv2 = nn.Conv2d(128, 20, 1, 1, bias=True)
        self.relu = nn.ReLU()
        self.flat = nn.Flatten(1)

    def forward(self, x):  # (1, 3, 20, 20)
        x = self.avg_pool(x)
        x = self.conv(x)
        x = self.conv2(x)
        x = self.flat(x)
        return x


SIZE = 20
backend = BackendType.SNPE

np.set_printoptions(suppress=True, precision=6)
torch.set_printoptions(6)
seed_torchv2(42)  # user-defined seeding helper


def gen_input_data(length=100):
    data = []
    for _ in range(length):
        data.append(np.ones((1, 3, SIZE, SIZE), dtype=np.float32) * 0.1 * np.random.randint(0, 10))
    return np.stack(data, axis=0)


model = Net()
model.eval()

train_data = gen_input_data(100)
dummy_input = np.zeros((1, 3, SIZE, SIZE), dtype=np.float32) + 0.5

print("pytorch fp32 result")
print(model(torch.from_numpy(dummy_input.copy())).float())

# quant
model = prepare_by_platform(model, backend)
enable_calibration(model)
for i, d in enumerate(train_data):
    _ = model(torch.from_numpy(d).float())
enable_quantization(model)

print("quant sim result")
print(model(torch.from_numpy(dummy_input.copy())).float())

input_shape = {"image": [1, 3, SIZE, SIZE]}
convert_deploy(model, backend, input_shape)

# save dummy input and test it on DSP
image = dummy_input.copy()
assert image.shape == (1, 3, SIZE, SIZE)
assert image.dtype == np.float32
image.tofile("./tmp.raw")
print("#" * 50)
```

```
pytorch fp32 result
tensor([[-0.347889, -0.289117, -0.083191, -0.222827,  0.124699,  0.235278,
          0.434433, -0.302174, -0.047763,  0.229472, -0.037784,  0.082496,
         -0.150852, -0.170281,  0.130777,  0.146441, -0.494992, -0.182881,
          0.600709, -0.063706]], grad_fn=<...>)

quant sim result
tensor([[-0.344930, -0.290467, -0.081694, -0.222389,  0.131618,  0.231466,
          0.435701, -0.299544, -0.049924,  0.226927, -0.036308,  0.081694,
         -0.149772, -0.172465,  0.131618,  0.149772, -0.494702, -0.181542,
          0.599088, -0.063540]], grad_fn=<...>)
```



- DLC conversion
`./snpe-onnx-to-dlc --input_network mqbench_qmodel_deploy_model.onnx --output_path tmp.dlc --quantization_overrides mqbench_qmodel_clip_ranges.json`
`./snpe-dlc-quantize --input_dlc tmp.dlc --input_list tmp_file.txt --output_dlc tmp_quat_mq.dlc --override_params --bias_bitwidth 32`

tmp_file.txt and tmp_file_android.txt each list a single file, tmp.raw; tmp.raw is the 3x20x20 float32 file saved by the Python script above.
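A quick sanity check can confirm the raw file has the layout snpe-dlc-quantize expects (a sketch; it recreates the same dummy input as the script above and round-trips it through `tofile`/`fromfile`):

```python
import numpy as np

# Recreate the dummy input from the script above and round-trip it.
dummy_input = np.zeros((1, 3, 20, 20), dtype=np.float32) + 0.5
dummy_input.tofile("./tmp.raw")

# SNPE consumes the raw file as a flat stream of float32 values.
raw = np.fromfile("./tmp.raw", dtype=np.float32)
assert raw.size == 1 * 3 * 20 * 20
print(np.array_equal(raw.reshape(1, 3, 20, 20), dummy_input))  # True
```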

- SNPE DSP run
`./snpe-net-run --container /sdcard/tmp_quat_mq.dlc --input_list /sdcard/tmp_file_android.txt --use_dsp`

```
##################################################
74.raw
(20,)
[-0.34493  -0.285929 -0.081694 -0.222389  0.127079  0.236005  0.435701
 -0.299544 -0.049924  0.226927 -0.036308  0.081694 -0.149772 -0.172465
  0.131618  0.149772 -0.490163 -0.177003  0.599088 -0.068078]
```

Comparing the quant sim result with the DSP result, six entries disagree (zero-based indices 1, 4, 5, 16, 17, and 19).
Tracin commented 2 years ago

Your whole pipeline is fine. In practice it is almost impossible to be bit-exact with backend hardware from within PyTorch; there are too many unknown computation details. MQBench aims to align with the backend as closely as possible in terms of quantization scheme and quantization placement. We usually measure the error between the two with a cosine-similarity metric; 0.99+ can be taken as a guarantee that training accuracy carries over to deployment.
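The cosine check described above can be sketched as follows (the two vectors are copied from the quant sim and DSP outputs earlier in the thread):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two flattened output tensors.
    a, b = np.ravel(a).astype(np.float64), np.ravel(b).astype(np.float64)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

quant_sim = np.array([-0.344930, -0.290467, -0.081694, -0.222389,  0.131618,
                       0.231466,  0.435701, -0.299544, -0.049924,  0.226927,
                      -0.036308,  0.081694, -0.149772, -0.172465,  0.131618,
                       0.149772, -0.494702, -0.181542,  0.599088, -0.063540])
dsp = np.array([-0.344930, -0.285929, -0.081694, -0.222389,  0.127079,
                 0.236005,  0.435701, -0.299544, -0.049924,  0.226927,
                -0.036308,  0.081694, -0.149772, -0.172465,  0.131618,
                 0.149772, -0.490163, -0.177003,  0.599088, -0.068078])

print(cosine_similarity(quant_sim, dsp) > 0.99)  # True: within the acceptable range
```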

changewOw commented 2 years ago

Thanks!

changewOw commented 2 years ago

@Tracin I did some further testing with mobilenetv3-small (num_classes=2) and found that for some samples the quant sim result is [-0.6173174 -0.05346843] while the DSP result is [-0.792305 -0.490937]; their cosine similarity is 0.8923229.

I'd like to ask what can be improved to better align the quant sim and DSP results. Is mobilenetv3 simply not well suited to quantization?

Tracin commented 2 years ago

You can start by checking whether the quantization parameters are correct.

changewOw commented 2 years ago

My full model is a UNet-style network with three outputs. encoder: mobilenetv3-small; decoder: upNearest2d -> upNearest2d -> upNearest2d

    def forward(self, x):
        feature_8x, feature_16x, feature_32x = self.model(x)

        logits_cls = self.head_cls(feature_32x)

        accu_radius, heatmaps_uv = self.decode_model(feature_8x, feature_16x, feature_32x)

        return logits_cls, accu_radius, heatmaps_uv

1. I checked the ONNX model with quantization nodes inserted; it looks fine.
2. This part looks a bit off:

        "1572": [
            {
                "bitwidth": 8,
                "min": -17.530067443847656,
                "max": 12.27104663848877
            }
        ],
        "1583": [
            {
                "bitwidth": 8,
                "min": -0.3967282474040985,
                "max": 12.248984336853027
            }
        ],
        "1591": [
            {
                "bitwidth": 8,
                "min": 0.0,
                "max": 5.7605085372924805
            }
        ],

The output of snpe-dlc-quantize is:

```
[INFO] InitializeStderr: DebugLog initialized.
[INFO] Writing intermediate model
[INFO] Setting activation for layer: image and buffer: image
[INFO] bw: 8, min: 0.000000, max: 1.000000, delta: 0.003922, offset: 0.000000
[INFO] Setting activation for layer: Conv_6 and buffer: 1572
[INFO] bw: 8, min: -17.530067, max: 12.271047, delta: 0.116867, offset: -150.000000
[INFO] Setting activation for layer: Add_11_Hswish and buffer: 1583
[INFO] bw: 8, min: -0.359582, max: 17.979096, delta: 0.071916, offset: -5.000000
[INFO] Setting activation for layer: Conv_24 and buffer: 1590
[INFO] bw: 8, min: -18.765767, max: 4.577016, delta: 0.091540, offset: -205.000000
[INFO] Setting activation for layer: Relu_25 and buffer: 1591
[INFO] bw: 8, min: 0.000000, max: 5.760509, delta: 0.022590, offset: 0.000000
[INFO] Setting activation for layer: GlobalAveragePool_29 and buffer: 1595
[INFO] bw: 8, min: 0.000000, max: 2.837885, delta: 0.011129, offset: 0.000000
```

I noticed a few problems:
a) For buffer 1583, the JSON gives min/max of -0.39/12.24, while the DLC log reports -0.35/17.97.
b) The JSON does not provide a delta and offset for buffer 1583; does the DLC compute them itself?
c) Buffer 1590 does not appear in the JSON file at all.
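On question b): a plausible reconstruction (an assumption; SNPE's exact rounding rules are not documented in this thread) is that snpe-dlc-quantize derives delta and offset from the min/max range using asymmetric affine 8-bit quantization. The sketch below reproduces the delta/offset the log prints for buffer 1572:

```python
def quant_params(vmin, vmax, bitwidth=8):
    # Asymmetric affine quantization over 2**bitwidth - 1 integer steps
    # (assumed scheme; matches the values in the snpe-dlc-quantize log).
    levels = 2 ** bitwidth - 1          # 255 for 8-bit
    delta = (vmax - vmin) / levels      # step size
    offset = round(vmin / delta)        # integer zero offset
    return delta, offset

# Buffer 1572: the log prints delta: 0.116867, offset: -150.000000
delta, offset = quant_params(-17.530067443847656, 12.27104663848877)
print(round(delta, 6), offset)  # 0.116867 -150
```

The same formula also reproduces the log's values for buffer 1583 when applied to the adjusted range (-0.359582, 17.979096), which suggests the DLC does compute delta/offset itself from whatever min/max it settles on.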

I put the ONNX model, the JSON, and the DLC log into the zip file below; I would appreciate it if you could take a look. Thanks!

detnet_center_unet.zip

Tracin commented 2 years ago

You can try:

github-actions[bot] commented 2 years ago

This issue has not received any updates in 120 days. Please reply to this issue if it is still unresolved!