PaddlePaddle / Paddle

PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)
http://www.paddlepaddle.org/
Apache License 2.0

Offline static quantization with PaddleSlim and model export, conversion of the quantized model to int8 with Paddle Lite, then loading the int8 model with Paddle fails #69224

Open czp97 opened 2 weeks ago

czp97 commented 2 weeks ago

Please ask your question

Environment: Kylin V10 OS, Paddle 2.6.0, PaddleSlim 2.6.1, FT-2000+ CPU, Kunlunxin R200 XPU

The original model is a ResNet50 exported from PyTorch and converted to a Paddle model. The post-training quantization (PTQ) code is as follows:

paddleslim.quant.quant_post_static(
    executor=exe,
    model_dir=model_path,
    quantize_model_path=save_dir,
    data_loader=dataloader,
    model_filename=model_file,
    params_filename=params_file,
    batch_size=batch_size,
    batch_nums=10,
    algo='KL',
    round_type='round',
    onnx_format=True,
    save_model_filename=save_model_file,
    save_params_filename=save_params_file,
    quantizable_op_type=["conv2d", "depthwise_conv2d", "mul"],
    weight_bits=8,
    activation_bits=8,
    weight_quantize_type='channel_wise_abs_max',
    activation_quantize_type='moving_average_abs_max'
)
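
For context, exe and dataloader are not defined in the snippet above. A minimal sketch of what they might look like, assuming a static-graph executor and random calibration data in place of a real dataset (the 1x3x224x224 float32 input shape is an assumption for ResNet50):

import numpy as np
import paddle

paddle.enable_static()
exe = paddle.static.Executor(paddle.CPUPlace())

def dataloader():
    # Calibration data generator; quant_post_static consumes batch_nums
    # batches. Each yield is a list of inputs matching the model's feed
    # order (assumption: a single image input).
    for _ in range(10):
        yield [np.random.rand(1, 3, 224, 224).astype('float32')]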

The Paddle Lite opt conversion code is as follows:

opt = paddle.lite.Opt()
opt.set_model_file("./raw-model/int8.pdmodel")
opt.set_param_file("./raw-model/int8.pdiparams")
opt.set_valid_places("xpu")
opt.set_model_type("naive_buffer")
opt.set_optimize_out("./int8")
opt.run()
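
As an aside, in the paddlelite Python wheel the optimizer class is usually imported from paddlelite.lite rather than paddle.lite; a sketch of the same conversion under that assumption, in case paddle.lite.Opt is unavailable in this environment:

from paddlelite.lite import Opt

opt = Opt()
opt.set_model_file("./raw-model/int8.pdmodel")
opt.set_param_file("./raw-model/int8.pdiparams")
opt.set_valid_places("xpu")          # target the Kunlunxin XPU backend
opt.set_model_type("naive_buffer")   # produce the .nb deployment format
opt.set_optimize_out("./int8")       # writes ./int8.nb
opt.run()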

This exports the Paddle Lite model int8.nb, plus model and param files.

Loading the int8.nb model with Paddle fails with:

File "/workspace/test/model-compress/eval/infer_acc_test.py", line 121, in main
    model = init_xpu_predictor(args)
  File "/workspace/test/model-compress/eval/infer_acc_test.py", line 56, in init_xpu_predictor
    xpu_predictor = create_predictor(config)
RuntimeError: (NotFound) Cannot open file ../model/resnet50/int8.nb/__model__, please confirm whether the file is normal.
  [Hint: Expected static_cast<bool>(fin.is_open()) == true, but received static_cast<bool>(fin.is_open()):0 != true:1.] (at /workspace/Paddle/paddle/fluid/inference/api/analysis_predictor.cc:2577)

Loading the model and param files with Paddle fails with:

File "/workspace/test/model-compress/eval/infer_acc_test.py", line 121, in main
    model = init_xpu_predictor(args)
  File "/workspace/test/model-compress/eval/infer_acc_test.py", line 56, in init_xpu_predictor
    xpu_predictor = create_predictor(config)
RuntimeError: (NotFound) Operator (io_copy_once) is not registered.
  [Hint: op_info_ptr should not be null.] (at /workspace/Paddle/paddle/fluid/framework/op_info.h:152)
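
io_copy_once appears to be an operator inserted by the Paddle Lite optimizer, so the model/param files written by opt contain Lite-specific ops that Paddle Inference does not register. One way to check whether the quantized model itself is fine is to load the PaddleSlim output directly, before the Lite opt pass; a minimal sketch (the enable_xpu() call is an assumption about how this Paddle package was built):

from paddle.inference import Config, create_predictor

# Load the PaddleSlim PTQ output directly, skipping Paddle Lite opt.
config = Config("./raw-model/int8.pdmodel", "./raw-model/int8.pdiparams")
config.enable_xpu()  # assumption: this build exposes the XPU backend
predictor = create_predictor(config)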

Extraction code for the quantized model files: bvxo

westfish commented 2 weeks ago

Could the way the .nb file is being loaded be the problem? A .nb model may need to be loaded and run through the Paddle Lite API.
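
For reference, loading a .nb model through the Paddle Lite Python API looks roughly like this; a minimal sketch, where the input shape is an assumption for a ResNet50-style model:

import numpy as np
from paddlelite.lite import MobileConfig, create_paddle_predictor

config = MobileConfig()
config.set_model_from_file("./int8.nb")  # the .nb produced by opt
predictor = create_paddle_predictor(config)

input_tensor = predictor.get_input(0)
input_tensor.from_numpy(np.random.rand(1, 3, 224, 224).astype('float32'))
predictor.run()
output = predictor.get_output(0).numpy()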

czp97 commented 2 weeks ago

Could the way the .nb file is being loaded be the problem? A .nb model may need to be loaded and run through the Paddle Lite API.

According to the PaddleLite-generic-demo.tar.gz example provided by Paddle Lite, a .nb model can be loaded with Paddle and used for inference. The code is as follows:

import paddle
import paddle.fluid as fluid

paddle.enable_static()
exe = fluid.Executor(fluid.CPUPlace())

# MODEL_FILE, PARAMS_FILE and model_dir come from the demo's configuration
if len(MODEL_FILE) == 0 and len(PARAMS_FILE) == 0:
    [program, feed_target_names,
     fetch_targets] = fluid.io.load_inference_model(model_dir, exe)
else:
    [program, feed_target_names,
     fetch_targets] = fluid.io.load_inference_model(
         model_dir,
         exe,
         model_filename=MODEL_FILE,
         params_filename=PARAMS_FILE)

Here, model_dir is the directory containing the .nb file, and model_filename/params_filename are the paths to the model and params files generated by Paddle Lite opt.
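
A quick usage sketch of running inference once the program is loaded, assuming a single image input (the shape is an assumption for ResNet50):

import numpy as np

img = np.random.rand(1, 3, 224, 224).astype('float32')
results = exe.run(program,
                  feed={feed_target_names[0]: img},
                  fetch_list=fetch_targets)
print(results[0].shape)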