OpenPPL / ppq

PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
Apache License 2.0
1.58k stars · 236 forks

Why doesn't the generated quantized ONNX show the quantization operators from the video when opened in Netron? #475

Open pl9632008 opened 1 year ago

ZhangZhiPku commented 1 year ago

This depends on the export platform you selected. If you choose Onnxruntime as the export platform, PPQ exports a model with QDQ nodes. All other platforms do not use ONNX QDQ nodes to carry quantization information at inference time; they use JSON instead. As a result you get an fp32 ONNX file plus a JSON file containing the quantization parameters.

The inference framework reads this JSON and converts the fp32 ONNX model into an int8 one.
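To make the two export paths concrete, here is a minimal sketch using PPQ's quantize_onnx_model / export_ppq_graph API as I understand it; the file paths, input shape, and calibration data are placeholders, and the exact TargetPlatform values should be checked against your PPQ version.

```python
import torch
from ppq import QuantizationSettingFactory, TargetPlatform
from ppq.api import export_ppq_graph, quantize_onnx_model

DEVICE = 'cuda'
INPUT_SHAPE = [1, 3, 640, 640]                                 # placeholder shape
calib_loader = [torch.randn(INPUT_SHAPE) for _ in range(32)]   # placeholder calibration data

# Quantize the fp32 ONNX model (here targeting TensorRT int8).
quantized = quantize_onnx_model(
    onnx_import_file='model.onnx',
    calib_dataloader=calib_loader,
    calib_steps=32,
    input_shape=INPUT_SHAPE,
    platform=TargetPlatform.TRT_INT8,
    setting=QuantizationSettingFactory.default_setting(),
    collate_fn=lambda x: x.to(DEVICE),
    device=DEVICE)

# Export path 1 - ONNXRuntime: quantization information is embedded as
# QuantizeLinear/DequantizeLinear (QDQ) nodes, which is what Netron shows.
# (For this path you would normally also quantize with platform=ONNXRUNTIME.)
export_ppq_graph(
    graph=quantized, platform=TargetPlatform.ONNXRUNTIME,
    graph_save_to='model_qdq.onnx')

# Export path 2 - TensorRT: the ONNX file stays fp32, and the per-tensor
# scales/zero-points are written to a separate JSON file for the engine builder.
export_ppq_graph(
    graph=quantized, platform=TargetPlatform.TRT_INT8,
    graph_save_to='model_fp32.onnx',
    config_save_to='quant_cfg.json')
```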

pl9632008 commented 1 year ago

[06/02/2023-17:05:20] [TRT] [W] Calibrator is not being used. Users must provide dynamic range for all tensors that are not Int32 or Bool.
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 458) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 459) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 461) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 462) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 465) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 466) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 468) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 469) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor 916, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 472) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 474) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor 918, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 477) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 478) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor 924, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor 931, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 543) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 544) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 546) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 547) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 550) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 551) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 553) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 554) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor 975, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 557) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 559) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor 977, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 562) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 563) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor 983, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor 990, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 628) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 629) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 631) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 632) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 635) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 636) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 638) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 639) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor 1034, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 642) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 644) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor 1036, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 647) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 648) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor 1042, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor 1049, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 700) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 701) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 703) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 704) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 706) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 707) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 709) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 710) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor 1093, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 712) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 714) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor 1095, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 716) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer 717) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor 1101, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor 1108, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[06/02/2023-17:05:20] [TRT] [W] Missing scale and zero-point for tensor output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor

generate /root/ppq/working/int8-trt.engine

What do the warnings above mean? The resulting int8-trt.engine can run inference, and its speed is about the same as the fp16 engine built with trtexec; does that mean the conversion succeeded?

ZhangZhiPku commented 1 year ago

Yes, the conversion succeeded. The TensorRT warnings indicate that we did not provide quantization information for some tensors, which is expected. Your model structure is probably fairly complex, and for a complex model not every operator/tensor can be quantized. In PPQ we implement a set of dispatchers that decide which operators in the model can be quantized; the remaining operators keep running in fp32.

You can run a statement like this: for op_name, op in graph.operations.items(): print(op_name, op.platform)

This prints the operator dispatching information, so you can check whether the dispatching matches your expectations; the same snippet is shown as a block below.
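For readability, here is the snippet as a block, assuming graph is the BaseGraph returned by quantize_onnx_model:

```python
# Print the platform each operator was dispatched to: quantized operators
# report an int8-capable platform, unquantizable ones stay on an fp32 platform.
for op_name, op in graph.operations.items():
    print(op_name, op.platform)
```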

You can also refer to this example to modify the dispatching manually: https://github.com/openppl-public/ppq/blob/master/ppq/samples/Yolo/yolo_5.py
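For reference, a minimal sketch of manual dispatching through the QuantizationSetting passed to quantize_onnx_model; the operator name below is a placeholder taken from the dispatch printout, and the attribute names follow the PPQ API as I understand it (see the linked sample for the authoritative usage):

```python
from ppq import QuantizationSettingFactory, TargetPlatform

setting = QuantizationSettingFactory.default_setting()

# Keep a problematic operator in fp32 instead of quantizing it.
# 'Conv_123' is a placeholder; use a real name from the printout above.
setting.dispatching_table.append(
    operation='Conv_123', platform=TargetPlatform.FP32)

# Then pass it in: quantize_onnx_model(..., setting=setting).
```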

pl9632008 commented 1 year ago

Thank you very much for the thorough answer!
